Psychometrika





CONTENTS 


A SINGLE PLANE METHOD OF ROTATION
L. L. THURSTONE

THE TEST-RETEST RELIABILITY OF QUALITATIVE DATA
LOUIS GUTTMAN

THE SIGNIFICANCE OF DIFFERENCE BETWEEN MEANS WITHOUT REFERENCE
TO THE FREQUENCY DISTRIBUTION FUNCTION
LEON FESTINGER

GENERAL SOLUTION OF THE ANALYSIS OF VARIANCE AND COVARIANCE IN
THE CASE OF UNEQUAL OR DISPROPORTIONATE NUMBERS OF OBSERVATIONS
IN THE SUBCLASSES
FEI TSAO

A NEW METHOD OF ANALYZING AESTHETIC PREFERENCES: SOME THEORETICAL
CONSIDERATIONS
E. A. PEEL








VOLUME ELEVEN JUNE 1946 NUMBER TWO 











PSYCHOMETRIKA—VOL. 11, NO. 2 
JUNE, 1946 


A SINGLE PLANE METHOD OF ROTATION 


L. L. THURSTONE* 
THE UNIVERSITY OF CHICAGO 


The method of rotation described here is applied to one hyper- 
plane at a time. The method seems to be simple and quite effective 
and it can be applied by a relatively inexperienced computer. The 
method does not postulate a positive manifold, and hence it is appli- 
cable also to bipolar factors. 


The rotation problem is for most computers one of the most dif- 
ficult in factor analysis. The reason is in the circumstance that most 
of the rotational methods require some experience before the graphi- 
cal representations can be handled effectively. It is our purpose in 
devising the present method to provide a procedure which can be 
handled by a clerk for most of the work. The method will be referred 
to as a single-plane method of rotation. The method seems to re- 
quire little more experience than is required to estimate the slope of 
a line through the origin for a set of points. Questions of indepen- 
dence of rotation are minimized in importance and the method seems 
to be convergent so that a hyperplane can be located without a dis- 
couraging amount of labor. The computer need not think about the 
summation of vectors nor about the distinction between a reference 
vector and its trace in the diagram. The method does not postulate 
a positive manifold, so that it can be used as well for locating posi- 
tive and bipolar factors. The method proceeds by locating one hyper- 
plane at a time so that the investigator can begin the more interest- 
ing job of interpretation of each factor while the remaining ones are 
being computed. This method has only recently been devised, and it 
has been tried on several sets of experimental data with promising 
results. 

While it is not necessary for a clerk-computer to understand the
theory of the method, the theory will be described first, followed by
the routine which can be explained to a novice in computing.

In Figure 1 let Q represent a trial unit reference vector whose
direction cosines are denoted $\lambda_{mq}$. The fixed orthogonal reference

* The author wishes to express his indebtedness to the Social Science Research 


Committee of The University of Chicago for support of the Psychometric Labora- 
tory and these factorial investigations. 




axes are denoted by the subscript m as in previous factorial work. 
We shall denote the given trial reference vector Q and the adjusted 
trial reference vector P. This will be done for each trial so that the 
reference vector P that is obtained in one trial becomes the given 
trial vector Q for the next adjustment. This process continues for,
perhaps, five or six adjustments until the reference vector seems to
be located. If there is no simple structure or if the particular hyper- 
plane is not defined by the configuration, that fact becomes evident. 

It is first assumed in Figure 1 that the trial vector Q is orthogonal
to one of the orthogonal reference axes M as shown, but we shall see
later that this restriction can be overcome. The projections of the
test vectors J on the trial vector Q are computed and denoted $v_{jq}$
(not shown on diagrams). We have then

$$v_{jq} = \sum_m a_{jm} \lambda_{mq}, \tag{1}$$

where $a_{jm}$ are the direction numbers of the test vector J with
respect to the fixed orthogonal frame M.

Since M is one of the unit vectors of the fixed orthogonal frame,
we already have the projections of the tests on this vector. They are
the values $a_{jm}$ in a column of the given factor matrix F. If M in the
figure represents the first centroid axis, then the projections (JM)
are given in the first column of F. In the figure we have represented,
by a set of points, the paired values from a column of F and the
column of $v_{jq}$. The object is to draw a line through the origin with a
slope so chosen that a large number of the points will lie close to the
line. This has been illustrated in the figure and the slope $s_m$ is noted.
It is simply the ordinate on the line TT for a = +1.

Let L be a vector so defined that (1) it has unit projection on Q
and (2) it is orthogonal to the line TT. The vector L can be expressed

$$L = Q - sM. \tag{2}$$

Let $l_{mq}$ be the direction numbers of L. Then we have

$$l_{mq} = \lambda_{mq} - s, \tag{3}$$

since the vector M is a unit vector on the co-ordinate axis M. The
new vector $\Lambda_p$ will be collinear with L and of unit length.

Let us now consider the oblique case which is illustrated in Figure 2.
The trial vector Q has the projection $\lambda_{mq}$ on the vector M as
shown in the figure. The projection $\lambda_{mq}$ is then one of the
direction cosines of Q, and M is in one of the fixed orthogonal axes.
When the values of $v_{jq}$ and $a_{jm}$ are plotted on orthogonal
cross-section paper, the diagram will look like Figure 1. The vector
$-sM$ is shown in Figure 2, where it is of course parallel to M. Here
we write

$$U = Q - sM, \tag{4}$$

where U satisfies only one of the requirements for L. The vector L
is orthogonal to TT, and so is the vector U. But the vector L is also
defined as having unit projection on Q. The vector U can be extended
by a stretching factor k so that it has unit projection on Q. Then we
have

$$L = kU \tag{5}$$

as shown in Figure 2. In order to determine the stretching factor for
U to give it unit projection on Q we write

$$k(UQ) = 1, \tag{6}$$

so that

$$k = \frac{1}{(UQ)}, \tag{7}$$

where UQ is a scalar product. From (4) and (7) we have

$$k = \frac{1}{(Q - sM)Q}, \tag{8}$$

or

$$k = \frac{1}{Q^2 - sMQ}, \tag{9}$$

and since Q is a unit vector we have

$$k = \frac{1}{1 - sMQ}. \tag{10}$$

But the scalar product MQ is merely the direction cosine $\lambda_{mq}$ for
the axis M. Hence we write

$$k = \frac{1}{1 - s\lambda_{mq}}. \tag{11}$$

The direction number of Q for the axis M is $\lambda_{mq}$, as shown in the
figure. Since the vector $-sM$ is parallel to M, the direction number
of the vector U for the axis M is

$$u_{mq} = \lambda_{mq} - s. \tag{12}$$

The direction number of the vector L for the same axis is $l_{mq}$ and it is

$$l_{mq} = k\,u_{mq}, \tag{13}$$

so that the desired direction number can be written in the form

$$l_{mq} = \frac{\lambda_{mq} - s}{1 - s\lambda_{mq}}, \tag{14}$$

which is a simple computing formula. Here the $\lambda_{mq}$ are the direction
cosines of the given unit trial vector Q. The slope s is read directly
from a graph. If there are r columns of the factor matrix F, then
there are as many graphs to be plotted. Each one gives a slope value
$s_m$, which is set equal to zero if no rotation is indicated.
When formula (14) has been evaluated for each of the orthog- 
onal axes M the new vector L is determined. It is then normalized 
and becomes the new trial vector P. 
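
The adjustment described by formula (14) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `adjust_trial_vector` is hypothetical, and the slopes, which the method reads from graphs, are simply supplied as numbers. The figures used are those of Trial 1 in Table 2 of the numerical example.

```python
import math

def adjust_trial_vector(lam_q, slopes):
    """One adjustment of the trial vector Q (hypothetical helper).

    lam_q  : direction cosines of the given unit trial vector Q
    slopes : slope s_m read from the graph for each axis M
             (zero where no rotation is indicated)
    Returns the direction cosines of the new unit trial vector P.
    """
    # Formula (14): l_mq = (lambda_mq - s) / (1 - s * lambda_mq)
    l = [(lam - s) / (1.0 - s * lam) for lam, s in zip(lam_q, slopes)]
    # Normalize L to unit length to obtain P
    c = 1.0 / math.sqrt(sum(x * x for x in l))
    return [c * x for x in l]

# Trial 1 of the numerical example (values taken from Table 2)
lam_q = [0.66, 0.52, 0.54]
slopes = [0.25, -0.18, -0.35]
lam_p = adjust_trial_vector(lam_q, slopes)
print([round(x, 3) for x in lam_p])
```

Because the sketch carries full precision while the table rounds each intermediate column to two decimals, the result should agree with the tabled values .44, .58, .68 only to within about one unit in the second decimal.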


A Numerical Example 

A three-dimensional example will be used to illustrate this method
of rotation numerically. Table 1 shows a factor matrix F for
twenty variables and three orthogonal factors. The problem is to
locate the reference axes $\Lambda_p$ which define a simple structure.

The procedure starts with the selection of a test vector. In the
present example test 15 was arbitrarily chosen as a starting vector.
Its direction numbers are shown in row 15 of F in Table 1. These three
direction numbers, $a_{jm}$, are recorded in the first column of Table 2,
Trial 1. This vector is normalized and it then becomes the first
unit vector $\Lambda_q$ with direction cosines $\lambda_{mq}$ as shown in the
second column. The computations for normalizing the given test vector
are shown in part in the table. We have then $\sum a^2 = .9173$ as shown
and $1/\sqrt{\sum a^2} = c$, which is the multiplying factor by which

$$c\,a_{jm} = \lambda_{mq},$$

where c = 1.0441.

The projections $v_{jq}$ are then computed and recorded as shown
under Trial 1 in Table 1. The column $v_{jq}$ for the first trial is plotted
against each of the three columns of the factor matrix F, and the
three resulting diagrams are shown in the top row of Figure 3.
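
The normalization of the starting vector and the computation of the projections by equation (1) can be illustrated as follows. This is a minimal sketch, not the clerical routine itself: the helper names `normalize` and `projections` are hypothetical, only three rows of F (tests 3, 7, and 15 of Table 1) are used, and the direction cosines are rounded to two decimals on the assumption that the tabled projections were computed from the rounded values.

```python
import math

def normalize(vec):
    """Return the multiplying factor c and the unit vector c * vec."""
    c = 1.0 / math.sqrt(sum(a * a for a in vec))
    return c, [c * a for a in vec]

def projections(F, lam):
    """Equation (1): v_jq = sum over m of a_jm * lambda_mq, for each test j."""
    return [sum(a * l for a, l in zip(row, lam)) for row in F]

# Three rows of the factor matrix F from Table 1, keyed by test number
F_rows = {3: [0.66, 0.54, 0.50],
          7: [0.86, -0.45, -0.27],
          15: [0.63, 0.50, 0.52]}

# Test 15 is the starting vector; normalize it to get the trial vector Q
c, lam = normalize(F_rows[15])
lam2 = [round(x, 2) for x in lam]      # two decimals, as recorded in Table 2

v = projections(list(F_rows.values()), lam2)
print(round(c, 4), lam2, [round(x, 2) for x in v])
```

Plotting each column of F against the resulting column of projections gives the diagrams from which the slopes are read.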

The work of plotting the diagrams can be considerably reduced 
by not plotting the points for which a is smaller than v. These points 
have rarely any effect in determining the slope of the desired line 
through the origin, and they may as well be omitted in plotting the 
diagrams. 

When all the planes with appreciable variance have been determined,
it is advisable to make a final set of diagrams of pairs of columns
of the oblique factor matrix V. One should expect then only
small adjustments. However, one should expect to find occasionally,
especially with an indeterminate structure, that an appreciable
angular adjustment will be made in the final set of diagrams. When the
structure is overdetermined, the present method should give the
solution without any adjustment in the final set of diagrams where
pairs of columns of the oblique matrix V have been plotted.

On each diagram a line is drawn through the origin so that it goes
through groups of points wherever possible. Large angular deviations
of 30° to 45° from the horizontal are not made unless the configuration
of points clearly demands so large an angular deviation from the
X-axis. The slope $s_m$ of each line is noted, and it is recorded in the
next column of Table 2. The slope of each line is read graphically.
It is read directly from a vertical line through the point +1 on the
base line. The rest of the calculations for Trial 1 are self-explanatory
as shown in Table 2. The column $l_{mq}$ is normalized and then becomes
the column $\lambda_{mp}$. The calculations show $\sum l^2 = 1.2122$, which
gives the multiplying factor c = .90827. The column $\lambda_{mp}$ shows the
direction cosines of $\Lambda_p$, which is the trial vector $\Lambda_q$ for the
next trial.

When the direction cosines $\lambda_{mp}$ of the next trial vector have been
determined, it is useful to compute the scalar product of the given
vector Q and the new unit vector P. This is the cosine of the angular
displacement $\phi$ represented by the trial in question. The corresponding
value of $\sin \phi$ is recorded as shown. For the first trial $\sin \phi = .28$.
It will be found that trials should be continued until $\sin \phi$ is about .10
or less. In the present example it is found that the second trial gives
$\sin \phi = .11$, which is small enough so that two adjustments are
considered sufficient in this case.
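
The check on the angular displacement can be reproduced with a short Python sketch. The function name `angular_displacement` is hypothetical, and the code assumes, as the text does, that both Q and P are unit vectors, so that their scalar product is taken directly as $\cos \phi$. The direction cosines are the two-decimal values recorded in Table 2.

```python
import math

def angular_displacement(q, p):
    """Return (cos phi, sin phi) for unit vectors Q and P."""
    cos_phi = sum(a * b for a, b in zip(q, p))           # scalar product PQ
    sin_phi = math.sqrt(max(0.0, 1.0 - cos_phi ** 2))    # guard against rounding
    return cos_phi, sin_phi

# (given vector Q, adjusted vector P) for Trials 1 and 2 of Table 2
trials = [([0.66, 0.52, 0.54], [0.44, 0.58, 0.68]),
          ([0.44, 0.58, 0.68], [0.50, 0.56, 0.66])]

results = [angular_displacement(q, p) for q, p in trials]
for cos_phi, sin_phi in results:
    print(round(cos_phi, 4), round(sin_phi, 2))
```

A computer following the rule in the text would stop when the second value is about .10 or less.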

In the second section of Table 2 these direction cosines are copied
in the column $\lambda_{mq}$, as shown. The projections $v_{jq}$ are computed and
recorded in Table 1 in the column for Trial 2. This column is plotted
against the columns of the factor matrix F, and we then have the
three diagrams in the bottom row of Figure 3. Here the groups of
points are closer to the horizontal line in all diagrams, a fact which
shows that the desired solution is being approached. A line through
the origin is drawn on each of these three diagrams which goes
through the concentrations of points. The slopes $s_m$ are recorded in
Table 2 and the rest of the computations proceed as before.

The value of $\sin \phi$ is .11, which is small enough so that further
rotation is not indicated. The projections $v_{jp}$ of the test vectors on the
reference vector determined in two trials are computed and recorded
in the last column of Table 1. This last column is, in fact, one column
of the oblique factor matrix V.

In locating the next reference vector we start with one of the 
test vectors which lie in or near the plane already determined. We 
choose therefore one of the vectors that have small values in the last 










column of Table 1. We might start with any one of the vectors 1, 2, 
4, 7, 10, 13, 14, 18, 19. These would not all lead to the same plane, 
but it does not matter in which order the planes are determined. The 
starting vector for determining the third plane is chosen from those 
test vectors which have small projections on both the first and second 
reference vectors $\Lambda_p$. In this manner one avoids starting with a
vector that leads to a plane that has already been determined. If the con-
figuration is indeterminate, the fact becomes evident in that there is 
no clear indication of where to draw the lines on the graphs or the 
adjustments may require large deviations that carry the new trial 
vector into one of the reference vectors that have already been deter- 
mined. 
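
The choice of starting vectors for the next plane can be sketched as a simple filter on the columns of projections already obtained. The function name `starting_candidates` and the cutoff of .10 are assumptions made for this illustration; the text says only that the projections should be small. The column used is the last column of Table 1.

```python
def starting_candidates(projection_columns, threshold=0.10):
    """Test numbers (1-based) whose projections on every reference
    vector found so far are smaller in absolute value than the cutoff.
    The cutoff is a hypothetical choice, not given in the text."""
    n = len(projection_columns[0])
    return [j + 1 for j in range(n)
            if all(abs(col[j]) < threshold for col in projection_columns)]

# Last column of Table 1: projections on the first reference vector
plane_a = [0.01, 0.02, 0.96, 0.03, 0.85, 0.81, 0.00, 0.40, 0.47, 0.02,
           0.65, 0.65, 0.00, 0.09, 0.94, 0.72, 0.30, 0.02, -0.02, 0.96]

candidates = starting_candidates([plane_a])
print(candidates)
```

For the third plane one would pass two columns, so that only tests with small projections on both reference vectors remain.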

It has been found that inexperienced student assistance can be 
used on these computations, most of which can be arranged with check 
columns. The check sums have not been shown here for Table 2. It 
is a fortunate circumstance that an error in these calculations is self- 
correcting in that the adjustment is made by the computations in the 
next trial. It is best to provide rather complete check columns so that 
each step is self-checking. 

The present method has been tried on several large factor studies,
on small test batteries, and on fictitious examples, and it has been
found so far to be rapidly convergent to the solution. One of its
advantages seems to be that the computer need not worry about the
independence of the several angular adjustments in each trial because 
each of them affects only one of the direction cosines of the new trial 
vector. The difficulty caused by overcorrection does not seem to be 
a problem with this method. 








TABLE 1

         Factor Matrix F           Plane A, Trials
Test      I      II     III          1       2       3
  1      .66    -.74    .14         .13    -.04     .01
  2      .72     .18   -.66         .21    -.03     .02
  3      .66     .54    .50         .99     .94     .96
  4      .87    -.21   -.44         .23    -.04     .03
  5      .83     .18    .51         .92     .82     .85
  6      .84     .52    .15         .91     .77     .81
  7      .86    -.45   -.27         .19    -.07     .00
  8      .85    -.43    .32         .51     .34     .40
  9      .86     .42   -.30         .62     .42     .47
 10      .88    -.34   -.35         .22    -.05     .02
 11      .89    -.15    .44         .75     .60     .65
 12      .88     .48   -.09         .78     .60     .65
 13      .67    -.72    .11         .13    -.05     .00
 14      .72     .25   -.62         .27     .04     .09
 15      .63     .50    .52         .96     .92     .94
 16      .94     .26    .16         .84     .67     .72
 17      .97    -.24   -.08         .47     .23     .30
 18      .62    -.72    .17         .13    -.03     .02
 19      .70     .11   -.65         .17    -.07    -.02
 20      .66     .54    .49         .98     .94     .96
Sum    15.71    -.02    .05       10.41    6.98    7.88
TABLE 2

              Test 15     Q                                              P
                a_jm   λ_mq      s     λ-s     sλ    1-sλ    l_mq    λ_mp
Trial 1   I     .63     .66     .25    .41    .16     .84     .49     .44
          II    .50     .52    -.18    .70   -.09    1.09     .64     .58
          III   .52     .54    -.35    .89   -.19    1.19     .75     .68

          Σa² = .9173     c = 1.0441
          Σl² = 1.2122    c = .90827
          PQ = cos φ = .9592     sin φ = .28

Trial 2   I             .44    -.07    .51   -.03    1.03     .49     .50
          II            .58     .05    .53    .03     .97     .55     .56
          III           .68     .05    .63    .03     .97     .65     .66

          Σl² = .9651     c = 1.0179
          PQ = cos φ = .9936     sin φ = .11














[FIGURE 1 and FIGURE 2: diagrams not reproduced.]

[FIGURE 3: diagrams not reproduced. The top row shows the Trial 1
projections plotted against columns I, II, and III of F; the bottom
row shows the corresponding diagrams for Trial 2.]














PSYCHOMETRIKA—VOL. 11, NO. 2 
JUNE, 1946 


THE TEST-RETEST RELIABILITY OF QUALITATIVE DATA 


LOUIS GUTTMAN 
CORNELL UNIVERSITY 


The test-retest reliability of qualitative items, such as occur in 
achievement tests, attitude questionnaires, public opinion surveys, 
and elsewhere, requires a different technique of analysis from that 
of quantitative variables. Definitions appropriate to the qualitative 
case are made both for the reliability coefficient of an individual on 
an item and for the reliability coefficient of a population on the item. 
From but a single trial of a large population on the item, it is pos- 
sible to compute a lower bound to the group reliability coefficient. 
Two kinds of lower bounds are presented. From two experimentally 
independent trials of the population on the item, it is possible to com- 
pute an upper bound to the group reliability coefficient. Two upper 
bounds are presented. The computations for the lower and upper 
bounds are all very simple. Numerical examples are given. 


1. Introduction 

An item is said to be unreliable for a person to the extent to
which his response to it would vary in repeated experiments under 
the same conditions. For quantitative items, the method of studying 
the variation has been to use arithmetic means as the “true” re- 
sponses and variances as measures of dispersion. But if an item is 
qualitative, the only average (of those commonly in use) that it can 
have is the mode, and a measure of variation is the relative frequency 
of the non-modal values.* Therefore, the study of test-retest reli- 
ability of qualitative items requires a different technique from that 
of quantitative items. The present paper is devoted to presenting 
appropriate definitions and technique for the qualitative case. 

Qualitative items are of considerable importance for the social 
and psychological sciences. Questions in an achievement test are
qualitative dichotomies when answers are either “right” or “wrong.” If
numerical values are assigned these two categories and are added up 
over a set of items to yield a total test score,† then the reliability of

* For a discussion of the distinction between qualitative and quantitative 
variables with respect to prediction and correlation, see (2). An adaptation of 
(2) is also available in Guilford’s textbook (1, Chap. 10). 

† The problem of whether a total score is meaningful is not raised here; that
is the problem of scale analysis (3). The analysis of test-retest reliability
remains the same, regardless of the meaning of items or total scores.




the total score should, of course, be studied by the technique of quan- 
titative variables; but this does not change the fact that the original 
items were qualitative and can have their separate reliabilities stud- 
ied by the technique to be developed here. In attitude and opinion 
research, the reliability of each qualitative item by itself is often of 
interest, so that the qualitative technique is especially appropriate 
in this field. In general, the technique is appropriate in any study 
where qualitative data are enumerated, be they census returns, the 
eye colors of fruit flies, commercial commodities, etc.

In Part I of this paper is given a description of the definitions 
used, and illustrations of applications of the qualitative technique are 
presented. The actual derivations and proofs are developed in Part II. 


PART I: APPLICATIONS 


2. Definitions and Terminology

Consider a universe of indefinitely many trials of a person on an 
item which has m categories. By the probability that the person will 
fall in a particular category, we mean the proportion of trials in 
which this will happen for him. The category with the highest prob- 
ability will be called the modal value for the person, and the highest 
probability itself will be called the modal probability. If the modal 
probability is attained by two or more categories, then any one of 
these can arbitrarily be selected as the modal value. The modal value 
plays the role in the qualitative theory that the expected or mean 
value plays in the quantitative theory; and the complement of the 
modal probability plays the role of the error variance. 

If all the categories have the same probability for the person 
(this common probability is then the modal probability and is equal 
to 1/m), then the item will be said to have zero reliability for that 
person. If the modal probability is unity (the non-modal categories 
each having zero probability) then the item is said to have perfect 
reliability for the person, or a reliability of unity. Let $P_i$ be the modal
probability for the i-th person. The reliability coefficient of the item
for that person will be defined as

$$R_i = \frac{m}{m-1}\left(P_i - \frac{1}{m}\right).$$

$R_i$ varies between zero and unity according as $P_i$ varies between 1/m
and unity.

The mean of $P_i$ over all people in the population will be denoted
by $\alpha$; this is the mean modal probability for the population. The
reliability coefficient of the item for the population is defined to be

$$\rho = \frac{m}{m-1}\left(\alpha - \frac{1}{m}\right).$$

$\rho$ is actually the mean of the $R_i$; that is, the population reliability of
the item is the mean of the individual reliabilities.
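
The two definitions, and the statement that the population coefficient is the mean of the individual coefficients, can be checked numerically. This is a minimal sketch: the function name `reliability` is hypothetical and the four modal probabilities are invented solely for illustration.

```python
def reliability(modal_probability, m):
    """R = m/(m-1) * (P - 1/m), for an item with m categories."""
    return m / (m - 1) * (modal_probability - 1 / m)

m = 3
# Hypothetical modal probabilities P_i for four persons (illustration only)
modal_probs = [1 / 3, 0.5, 0.8, 1.0]

individual = [reliability(P, m) for P in modal_probs]   # the R_i
alpha = sum(modal_probs) / len(modal_probs)             # mean modal probability
rho = reliability(alpha, m)                             # population coefficient
mean_R = sum(individual) / len(individual)

print(individual, rho, mean_R)
```

The equality of `rho` and `mean_R` follows because the defining formula is linear in the modal probability.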

If $\rho$ is zero, then all* of the $R_i$ must be zero; and if $\rho$ is unity,
then all* of the $R_i$ must be unity. Zero group reliability implies zero
individual reliabilities, and perfect group reliability implies perfect
individual reliabilities. An intermediate group reliability coefficient is a
value around which the individual reliabilities can cluster with various
degrees of dispersion; it is not assumed that all individuals are
equally reliable. It should be clear, though, that the closer $\rho$ is to
zero or to unity, the greater the restraint to the variance of the $R_i$
about $\rho$.

The experimental problem is to obtain information about the
$R_i$. To obtain this directly would require actual repeated experiments
under the same conditions, which is a difficult, if not impossible,
thing to do empirically in many cases. But fortunately it is
possible to obtain information about $\rho$ from only a single trial. Lower
bounds to $\rho$ can be determined from only a single trial, so that we
can say the group reliability of the item is at least a certain amount.

The notion of a lower bound to a reliability coefficient was introduced
in (4) for the case of quantitative variables. To derive a
lower bound from a single trial for the quantitative case requires
either that (a) the variable be the sum of at least two experimentally
independent items,† or that (b) the item be correlated with one or
more experimentally independent items.‡

Condition (a) of course cannot be met by a qualitative item and 
has no analogue for the qualitative case. Instead, we have the far 
simpler phenomenon that only the frequency distribution of the popu- 
lation on the item itself is needed to determine a lower bound, as is de- 
scribed in the next section. An analogue to condition (b) does hold 
and provides the lower bound based on sub-frequencies described 
in §4. 

If two experimentally independent trials are actually made for
an item, then also an upper bound can be computed for the reliability
coefficient, as is illustrated in §5.

In order to obtain a lower bound from a single trial, or an upper
bound from two trials, it is assumed here that the population of
people is indefinitely large. If the population is finite or if a sample
from an indefinitely large population is used for the computations,
then the bounds derived here will of course be subject to sampling
error. The sampling theory of the bounds has not yet been completely
worked out, but it seems clear that they should be quite reliable when
computed from samples of the size ordinarily used in public opinion
polls.

* Except possibly for an infinitesimal proportion of the people.
† This is true for the first five bounds in (4).
‡ See the theorem in §17 of (4) on which the sixth bound is based.


3. The Marginal Lower Bound 
Let $A_1, A_2, \cdots, A_m$ be the relative frequencies of the respective
m categories of the item on a single trial of the population. In §7
and §10 below it is proved that* $\alpha$ is not less than the largest of the
A's. For example, suppose each person in a large population were
asked if he agreed to a certain political proposition, and suppose the
results were

    yes           60%
    undecided     15
    no            25
                 ----
                 100%

Here, m = 3; the largest A is .60; so we can state that $\alpha \ge .60$; and
we can further state that

$$\rho = \frac{m}{m-1}\left(\alpha - \frac{1}{m}\right) \ge \frac{3}{2}\left(.60 - \frac{1}{3}\right),$$

or $\rho \ge .40$.

If the responses had been:

    yes           33 1/3%
    undecided     33 1/3
    no            33 1/3
                 --------
                 100%

then we would state that $\alpha \ge .33\ 1/3$ and that

$$0 \le \rho \le 1,$$

which provides no information about $\rho$. The responses can have anything
from perfect reliability to zero reliability if the population is
equally divided over the m categories. But if there are unequal
frequencies, then there must be at least some group reliability.


* Except possibly for an infinitesimal proportion of the trials. 
















Such a lower bound, based on marginal frequencies, cannot ap- 
proach unity, of course, unless the population piles up in one cate- 
gory. For example, if, on a true-false question, 90% of a large class 
of students answered correctly (or incorrectly), then for that ques- 
tion we can state that 

$$.8 \le \rho \le 1.$$


It is of interest to notice that, for a fixed maximum marginal, 
the lower bound increases with the number of categories. If 90% of 
a large class of students answered a multiple-choice question correct- 
ly, where there were four choices, then for that question the marginal 
lower bound would be .87, which is larger than the bound of .8 for 
the true-false question in the preceding paragraph. Similarly, a mul- 
tiple-choice question which was answered correctly by 50% of the 
population can be said to have some reliability, whereas nothing can 
be said about the reliability of a true-false question which 50% an- 
swered correctly, if this per cent is the only information available. 
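
The marginal lower bound is a one-line computation, sketched here in Python. The function name `marginal_lower_bound` is hypothetical. The first three calls use the frequencies given in this section; the split of the remaining 10% among three wrong choices in the last call is invented, since only the largest marginal and the number of categories affect the bound.

```python
def marginal_lower_bound(marginals):
    """Lower bound to the group reliability coefficient from the
    relative frequencies of the m categories on a single trial."""
    m = len(marginals)
    return m / (m - 1) * (max(marginals) - 1 / m)

opinion = marginal_lower_bound([0.60, 0.15, 0.25])      # yes / undecided / no
even_split = marginal_lower_bound([1 / 3, 1 / 3, 1 / 3])
true_false = marginal_lower_bound([0.90, 0.10])
# Four-choice question, 90% correct; the other three figures are hypothetical
four_choice = marginal_lower_bound([0.90, 0.04, 0.03, 0.03])

print(opinion, even_split, true_false, four_choice)
```

The last two results show the point made in the text: for a fixed largest marginal, the bound rises with the number of categories.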

Many an item will actually have a reliability coefficient that is 
far greater than its marginal lower bound, so that it is desirable to 
seek a more efficient lower bound. A better bound is described in 
the next section. 


4. The Lower Bound From Joint Occurrences 


By relating the item to another experimentally independent item 
or set of items, it is possible to improve on the marginal lower bound. 
It is proved in §8 and §10 below that $\alpha$ for an item is* not smaller
than the sum of its largest sub-frequencies (expressed as propor- 
tions) in the joint occurrence table of that item with any other item 
obtained from a single trial, provided the table is based on a large 
population of individuals and the items are experimentally indepen- 
dent. 

As an example, consider the joint occurrences in a single trial of
two items, U and V, for a large population. U has three categories:
$U_1$, $U_2$, and $U_3$; and V has two categories: $V_1$ and $V_2$. Suppose the
joint occurrences in a single trial are as follows:

             U_1     U_2     U_3
    V_1      .40     .20     --       .60
    V_2      --      .10     .30      .40
             .40     .30     .30     1.00


* Except possibly for an infinitesimal proportion of the trials. 











V has three columns of sub-frequencies, one for each value of U. The
largest sub-frequencies in each column are, respectively, .40, .20, and
.30. Then we can state that $\alpha$ for V is not less than

$$.40 + .20 + .30 = .90,$$

and that

$$\rho \ge \frac{2}{1}\left(.90 - \frac{1}{2}\right),$$

or $\rho \ge .80$. This is a great improvement over the marginal lower
bound for V, which is

$$\frac{2}{1}\left(.60 - \frac{1}{2}\right) = .20.$$

Similarly, we obtain a better lower bound for U from the same
data. From the table of joint frequencies, we can state that $\alpha$ for U
is not less than the sum of the largest sub-frequencies of U:

$$.40 + .30 = .70,$$

so that for U,

$$\rho \ge \frac{3}{2}\left(.70 - \frac{1}{3}\right),$$

or $\rho \ge .55$. This is not extremely high, but still it is an improvement
over the marginal lower bound for U, which is

$$\frac{3}{2}\left(.40 - \frac{1}{3}\right) = .10.$$

If a lower bound obtained from a bivariate table is not very high
and if it is believed that an item is much more reliable than this lower
bound, then it may be worth while to seek a better lower bound by
considering relationships with more than one other item, provided
the additional items are also experimentally independent of the one
whose reliability is sought. The sum of the highest sub-frequencies
of the item in the multivariate table is* again a lower bound to $\alpha$.


In particular, if an item is perfectly related to any set whatsoever
of items experimentally independent of it, then we can state that
the item is perfectly reliable.
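
The lower bound from joint occurrences can be sketched in Python for the table of U and V given in this section. The function name `joint_lower_bound` and its `item` argument are hypothetical, and the empty cells of the table are entered as zero, which is an assumption about what the dashes denote.

```python
def joint_lower_bound(table, item="rows"):
    """Lower bounds to alpha and rho for one item of a joint
    occurrence table (proportions), where table[r][c] is the cell
    for row category r and column category c.

    item="rows"    : the item of interest defines the rows
    item="columns" : the item of interest defines the columns
    """
    if item == "rows":
        m = len(table)
        groups = list(zip(*table))    # one group per category of the other item
    else:
        m = len(table[0])
        groups = table
    alpha_bound = sum(max(group) for group in groups)
    rho_bound = m / (m - 1) * (alpha_bound - 1 / m)
    return alpha_bound, rho_bound

# Rows are V_1, V_2; columns are U_1, U_2, U_3 (dashes entered as 0)
uv = [[0.40, 0.20, 0.00],
      [0.00, 0.10, 0.30]]

alpha_V, rho_V = joint_lower_bound(uv, item="rows")
alpha_U, rho_U = joint_lower_bound(uv, item="columns")
print(alpha_V, rho_V, alpha_U, rho_U)
```

For a multivariate table the same sum would be taken over every combination of categories of the other items; that extension is not shown here.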


* Except possibly for an infinitesimal proportion of the trials.



















5. The Reliability Coefficient and Two Independent Trials 

From a single trial, only a lower bound can be established for 
the group reliability coefficient. But if it is possible to make two
experimentally independent trials of the same item, then it is possible
also to set an upper bound to the group reliability coefficient. In prac- 
tice, the lower and upper bounds will often be very close to each 
other, so that the coefficient will be determined within a relatively 
small interval. [In the quantitative case, it may be remembered, two 
trials suffice to determine* the reliability coefficient exactly. See (4), 
267-268. ] 

Two upper bounds are established in §9 below, the second being 
somewhat better than the first, but involving slightly more arithmetic. 
As an example of the use of the bounds, consider a trichotomous item 
repeated twice, but experimentally independently, on a large
population, with the following results:

                            Second Trial
                        U_1     U_2     U_3
               U_1      .10     .15     .05      .30
    First
    Trial      U_2      .15     .25     .05      .45
               U_3      .05     .05     .15      .25
                        .30     .45     .25     1.00

The joint occurrence table must* be symmetric (indeed, it must* be
Gramian) for two independent trials, if the population is very large.

A lower bound for $\alpha$ is obtained as in §4, by adding up the highest
sub-frequency in each row (column):

$$\alpha \ge .15 + .25 + .15 = .55.$$

It is important to notice that $\alpha$ is not the same as the average
probability of remaining in the same category on two trials. If $\gamma$ is the
average probability of remaining in the same category on two trials,
then, as is shown in §9 and §10 below, $\gamma$ can be set equal to the sum
of the principal diagonal elements:

$$\gamma = .10 + .25 + .15 = .50.$$


$\alpha$ is always greater than $\gamma$ unless both are equal to unity (in which
case $\rho = 1$). This inequality is true even if all the largest
sub-frequencies are in the principal diagonal.

* Except possibly for an infinitesimal proportion of the pairs of trials.

A simple upper bound for $\alpha$ is the square root of $\gamma$:

$$\alpha \le \sqrt{\gamma}.$$

Hence, for our example we can state that $\alpha \le .71$. A better upper
bound is

$$\alpha \le \frac{1 + \sqrt{(m-1)(m\gamma - 1)}}{m},$$

where m is the number of categories in the item. For the present
example m = 3 and we can set $\gamma = .50$, so we can state that

$$\alpha \le .67.$$

Hence, combining both the lower and upper bounds into one statement:

$$.55 \le \alpha \le .67,$$

which implies

$$.33 \le \rho \le .50.$$
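
The bounds obtained from two independent trials can be collected in one short Python sketch. The function name `two_trial_bounds` and the dictionary keys are hypothetical; the table is the trichotomous example of this section, and the guard against a negative radicand is an added precaution for sample data, not part of the stated formula.

```python
import math

def two_trial_bounds(table):
    """Bounds on alpha and rho from the joint occurrence table
    (proportions) of two independent trials of one m-category item."""
    m = len(table)
    alpha_lower = sum(max(row) for row in table)          # as in section 4
    gamma = sum(table[i][i] for i in range(m))            # principal diagonal
    alpha_upper_simple = math.sqrt(gamma)
    radicand = max(0.0, (m - 1) * (m * gamma - 1))        # precaution only
    alpha_upper = (1 + math.sqrt(radicand)) / m

    def to_rho(a):
        return m / (m - 1) * (a - 1 / m)

    return {"gamma": gamma,
            "alpha_lower": alpha_lower,
            "alpha_upper_simple": alpha_upper_simple,
            "alpha_upper": alpha_upper,
            "rho_lower": to_rho(alpha_lower),
            "rho_upper": to_rho(alpha_upper)}

# Rows: first trial U_1..U_3; columns: second trial U_1..U_3
table = [[0.10, 0.15, 0.05],
         [0.15, 0.25, 0.05],
         [0.05, 0.05, 0.15]]

bounds = two_trial_bounds(table)
for name, value in bounds.items():
    print(name, round(value, 3))
```

The interval for the coefficient is obtained by passing both bounds on the mean modal probability through the defining formula.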
As a final comment, it should be noted that $\rho$ is bounded from
two trials, even though an individual $R_i$ cannot be estimated very
well. A larger series of trials on a particular person would be
necessary to estimate his $R_i$.


PART II: DERIVATIONS 


6. Analytical Definitions 
Consider an item U with the m categories $U_1, U_2, \cdots, U_m$. An
indefinitely large population of individuals is supposed to be given
indefinitely many trials on the item. In this section and in §7, we
assume nothing about experimental independence of the trials. The
definition of, and lower bounds for, the reliability coefficient of U
will be established in terms of parameters defined over all persons
and trials. That the lower bounds can be observed from a single trial
is shown in §10, using the hypothesis of experimental independence.

The response of each person to the item on each trial can be
represented by $a_{ijk}$, where

$$a_{ijk} = \begin{cases} 1 & \text{if individual } i \text{ is in } U_j \text{ on trial } k \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$

The probability over trials that individual i will be in $U_j$ is

$$p_{ij} = \underset{k}{E}\, a_{ijk}, \tag{2}$$

where E signifies the expected (i.e., mean) value over the indicated
subscript. Let $P_i$ be the largest of the $p_{ij}$ for individual i; this will
be called his modal probability. Then

$$p_{ij} \le P_i \quad (j = 1, 2, \cdots, m;\ i = 1, 2, \cdots); \tag{3}$$

and, since $\sum_{j=1}^{m} p_{ij} = 1$, we have

$$P_i \ge \frac{1}{m} \quad (i = 1, 2, \cdots). \tag{4}$$


For a given individual, if for a certain j it is true that
$p_{ij} = P_i$, then $U_j$ is a modal category for that person. When
there are two or more modal categories for a person, one of these can
be arbitrarily designated to be the modal category to represent a
“true” value for the person. $P_i$ is the probability that a person
will be at his modal value.
The reliability coefficient for the i-th person is defined to be

$$R_i = \frac{m}{m-1}\left(P_i - \frac{1}{m}\right). \tag{5}$$

From (4) and the fact that $P_i \le 1$, we have

$$0 \le R_i \le 1. \tag{6}$$
The reliability of the item for the population is defined as
follows. Let $\alpha$ be the mean modal probability for the population:

$$\alpha = E_i\, P_i. \tag{7}$$

The reliability coefficient of U for the population is then defined as:

$$\rho = \frac{m}{m-1}\left(\alpha - \frac{1}{m}\right). \tag{8}$$

From (4) and (7),

$$\frac{1}{m} \le \alpha \le 1, \tag{9}$$

so that

$$0 \le \rho \le 1. \tag{10}$$

Taking expectations over i for both members of (5) and using (7),
we find that

$$\rho = E_i\, R_i; \tag{11}$$

that is, the reliability coefficient for the population is the mean of the
individual reliability coefficients.
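
The definitions in this section can be illustrated numerically. The sketch below assumes a small, made-up table of per-person category probabilities (the values and the function names are hypothetical, chosen only so that each row sums to 1) and confirms that the population coefficient equals the mean of the individual coefficients.

```python
def modal_probabilities(p):
    """Largest category probability for each person (rows of p)."""
    return [max(row) for row in p]

def reliability(P, m):
    """Rescale a modal probability from [1/m, 1] onto [0, 1]."""
    return m / (m - 1) * (P - 1 / m)

# Hypothetical probabilities for three persons on a three-category item.
p = [
    [0.8, 0.1, 0.1],
    [0.5, 0.3, 0.2],
    [1 / 3, 1 / 3, 1 / 3],
]
m = 3

P = modal_probabilities(p)                 # modal probability per person
R = [reliability(P_i, m) for P_i in P]     # individual coefficients
alpha = sum(P) / len(P)                    # mean modal probability
rho = reliability(alpha, m)                # population coefficient
mean_R = sum(R) / len(R)

print(round(rho, 4), round(mean_R, 4))     # the two values agree
```

The third person, who is equally likely to fall in every category, has an individual coefficient of zero, which is the lower limit allowed by the definition.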










7. Derivation of the Marginal Lower Bound 
The marginal relative frequency of a category is the proportion
of observations falling into that category. Let $A_j$ be the marginal
relative frequency of $U_j$ over all trials and persons:

$$A_j = E_i E_k\, a_{ijk}. \tag{12}$$

From (2), (12) can be written as

$$A_j = E_i\, p_{ij}, \tag{13}$$

so that from (3), $A_j \le E_i P_i$ for all j; or, from (7),

$$A_j \le \alpha \qquad (j = 1, 2, \ldots, m). \tag{14}$$

Inequality (14) is the basis for our first lower bound. It states that
$\alpha$ is not less than any of the marginals, so that in particular
$\alpha$ is not less than the largest marginal.


Let A be the largest of the $A_j$. Then from (8) and (14), we have
a lower bound to the group reliability coefficient:

$$\rho \ge \frac{m}{m-1}\left(A - \frac{1}{m}\right). \tag{15}$$

Each $A_j$ is defined over all trials as well as persons, and it is
desirable to bound $\alpha$ from but a single trial. In §10 it is shown
that the marginal for $U_j$ on a single trial is actually equal to $A_j$
(except possibly for an infinitesimal proportion of the trials),
provided that the variation from trial to trial of each person is
independent of the variation of everyone else. In practice, then, the
marginal from a single trial can be used as being equal to $A_j$; and
the largest marginal from a single trial can be used for A.
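
As a minimal sketch of the marginal lower bound in (15), the function below takes the category proportions observed on one trial and returns the bound. The function name and the example proportions are assumptions made for illustration.

```python
def marginal_lower_bound(marginals):
    """Lower bound for the reliability coefficient from single-trial marginals.

    marginals: proportions of the population in each of the m categories.
    """
    m = len(marginals)
    A = max(marginals)                    # largest marginal
    return m / (m - 1) * (A - 1 / m)

# Hypothetical three-category item.
print(round(marginal_lower_bound([0.5, 0.3, 0.2]), 2))   # 0.25
```

When every category is equally popular the largest marginal is 1/m and the bound is zero, so this bound is informative only for items with an uneven split.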


8. Derivation of the Lower Bound from Joint Occurrences 
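Let V be an item with the n categories $V_1, V_2, \ldots, V_n$, and let

$$b_{igk} = \begin{cases} 1 & \text{if individual } i \text{ is in } V_g \text{ on trial } k, \\ 0 & \text{otherwise.} \end{cases} \tag{16}$$

Let $q_{ig}$ be the probability over trials that individual i will be
in $V_g$:

$$q_{ig} = E_k\, b_{igk}. \tag{17}$$

Then clearly

$$\sum_{g=1}^{n} q_{ig} = 1. \tag{18}$$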










The proportion of joint occurrences of $U_j$ and $V_g$ over all trials
and persons is

$$C_{jg} = E_i E_k\, a_{ijk} b_{igk}. \tag{19}$$

The two items are said to be experimentally independent* for the
ith person if, for the given i,

$$E_k\, a_{ijk} b_{igk} = p_{ij} q_{ig} \qquad (j = 1, 2, \ldots, m;\ g = 1, 2, \ldots, n). \tag{20}$$

Equation (20) states that, for individual i, the probability of the
joint occurrence of $U_j$ and $V_g$ (over the trials) is the product of
the separate probabilities. If (20) holds for all i, then taking
expectations over i and using (19), we have

$$C_{jg} = E_i\, p_{ij} q_{ig}. \tag{21}$$

From (3) and (21),

$$C_{jg} \le E_i\, P_i q_{ig} \qquad (j = 1, 2, \ldots, m;\ g = 1, 2, \ldots, n). \tag{22}$$


Let $C_g$ be the largest value of $C_{jg}$ for fixed g, so that

$$C_{jg} \le C_g \qquad (j = 1, 2, \ldots, m;\ g = 1, 2, \ldots, n). \tag{23}$$

Now, (22) states in particular that

$$C_g \le E_i\, P_i q_{ig} \qquad (g = 1, 2, \ldots, n). \tag{24}$$

Summing both members of (24) over g and using (18), we have

$$\sum_{g=1}^{n} C_g \le E_i\, P_i = \alpha, \tag{25}$$

which provides the desired lower bound:

$$\rho \ge \frac{m}{m-1}\left(\sum_{g=1}^{n} C_g - \frac{1}{m}\right). \tag{26}$$

That each of the $C_g$, and hence their sum, is observable from but
a single trial is proved in §10.

The lower bound in (26) is better than the marginal lower bound
in (15), a fact which can be seen as follows. Summing both members
of (19) over g and using (18) and (12), we have

$$A_j = \sum_{g=1}^{n} C_{jg}. \tag{27}$$

Hence, from (23),

$$A_j \le \sum_{g=1}^{n} C_g \qquad (j = 1, 2, \ldots, m), \tag{28}$$

so that in particular

$$A \le \sum_{g=1}^{n} C_g. \tag{29}$$

* For further discussion of experimental independence, see (4, 263-265).
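
A small numerical sketch may help in comparing the two lower bounds. It assumes a hypothetical table of joint proportions for an item U (rows) and a second item V (columns); the table values and function names are illustrative only.

```python
def joint_lower_bound(C):
    """Bound (26): uses the largest joint proportion in each column of C."""
    m = len(C)                                   # categories of U (rows)
    n = len(C[0])                                # categories of V (columns)
    column_maxima = [max(C[j][g] for j in range(m)) for g in range(n)]
    return m / (m - 1) * (sum(column_maxima) - 1 / m)

def marginal_lower_bound(C):
    """Bound (15): uses the largest row total (marginal of U)."""
    m = len(C)
    A = max(sum(row) for row in C)
    return m / (m - 1) * (A - 1 / m)

# Hypothetical joint proportions; all entries sum to 1.
C = [
    [0.30, 0.10, 0.05],
    [0.10, 0.20, 0.05],
    [0.05, 0.05, 0.10],
]

print(round(joint_lower_bound(C), 3))      # 0.4
print(round(marginal_lower_bound(C), 3))   # 0.175
```

The sum of column maxima (.60) exceeds the largest row total (.45), so the joint-occurrence bound is the sharper of the two here, in line with (29).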


9. The Upper Bounds From Two Trials 

The problem of two trials of a single item can be handled by
regarding the item V of the preceding section to be a retrial of U, so
that expectations over k are interpreted as over a universe of pairs
of trials. Then m = n, $U_j = V_j$, and $p_{ij} = q_{ij}$, so that (21)
becomes

$$C_{jg} = E_i\, p_{ij} p_{ig}. \tag{30}$$

Equation (30) shows that the table of joint occurrences for the
test-retest of an item must be symmetric and Gramian.

The probability that person i will be in $U_j$ on both trials is, by
the multiplicative law of independent probabilities, $p_{ij}^2$. Let
$\eta_i$ be the probability that person i will be in some one category
on both trials. Then, by the additive law for mutually exclusive events,

$$\eta_i = \sum_{j=1}^{m} p_{ij}^2. \tag{31}$$

Let $\gamma$ be the mean probability that a person will remain in the
same category on both trials:

$$\gamma = E_i\, \eta_i. \tag{32}$$

From (30) and (31),

$$\gamma = \sum_{j=1}^{m} C_{jj}. \tag{33}$$

Hence, $\gamma$ is observable from a single pair of trials, since the
diagonal elements $C_{jj}$ are observable, as is shown in the next
section. $\gamma$ is the basis for our two upper bounds.

From (31), since one of the $p_{ij}$ is $P_i$, we have

$$\eta_i \ge P_i^2 \qquad (i = 1, 2, \ldots). \tag{34}$$

Taking expectations over i, and using (32),

$$\gamma \ge E_i\, P_i^2. \tag{35}$$

Since the mean of squares is not less than the square of the mean,

$$E_i\, P_i^2 \ge (E_i\, P_i)^2 = \alpha^2. \tag{36}$$

Hence, from (35) and (36), we have the first upper bound:

$$\alpha \le \sqrt{\gamma}. \tag{37}$$
To improve on (37), we write (31) as

$$\eta_i = P_i^2 + {\sum_j}' p_{ij}^2, \tag{38}$$

where ${\sum_j}'$ indicates summation over the m − 1 non-modal categories.
Since the mean of squares is not less than the square of the mean,

$$\frac{{\sum_j}' p_{ij}^2}{m-1} \ge \left(\frac{{\sum_j}' p_{ij}}{m-1}\right)^2. \tag{39}$$

Using (38) and (39) and the fact that ${\sum_j}' p_{ij} = 1 - P_i$, we have

$$\eta_i \ge P_i^2 + \frac{(1 - P_i)^2}{m-1} \qquad (i = 1, 2, \ldots). \tag{40}$$

Taking expectations over i in (40) and using (36) and (32) yields

$$\gamma \ge \frac{m\alpha^2 - 2\alpha + 1}{m-1}, \tag{41}$$

whence

$$m\alpha^2 - 2\alpha + [1 - (m-1)\gamma] \le 0. \tag{42}$$

From (42) we obtain the second upper bound for $\alpha$:

$$\alpha \le \frac{1 + \sqrt{(m-1)(m\gamma - 1)}}{m}. \tag{43}$$
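
The passage from the quadratic inequality to the second upper bound can be spelled out. The lines below are only the ordinary quadratic formula applied to the left member of the inequality, written here as a reading aid rather than as part of the original derivation.

```latex
m\alpha^2 - 2\alpha + [1 - (m-1)\gamma] = 0
\;\Longrightarrow\;
\alpha = \frac{2 \pm \sqrt{4 - 4m\,[1 - (m-1)\gamma]}}{2m}
       = \frac{1 \pm \sqrt{1 - m + m(m-1)\gamma}}{m}
       = \frac{1 \pm \sqrt{(m-1)(m\gamma - 1)}}{m}.
```

Because the coefficient of $\alpha^2$ is positive, the quadratic is nonpositive only between its two roots, so $\alpha$ cannot exceed the larger root. For m = 3 and $\gamma = .50$ the larger root is (1 + 1)/3 = .67.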


10. Observability from a Single Trial 
The proportion of the population observed in $U_j$ on trial k is

$$A_{jk} = E_i\, a_{ijk}. \tag{44}$$

We shall prove that $A_{jk} = A_j$ except possibly for an infinitesimal
proportion of the trials, so that each of the total marginals is
observable from but a single trial. In particular, then, A will be equal
to the largest marginal observed on a single trial.

The proof consists of showing the variance of $A_{jk}$ is zero over
all trials. Taking expectations over k for both members of (44) and
using (12) we have

$$E_k\, A_{jk} = A_j. \tag{45}$$

Then the variance over trials of $A_{jk}$ is

$$\sigma_j^2 = E_k\, A_{jk}^2 - A_j^2. \tag{46}$$
k 


To evaluate the right member, it is convenient first to consider a
finite population of N individuals and then to take the limit as
$N \to \infty$. For finite N, the operation $E_i$ is the operation
$\frac{1}{N}\sum_{i=1}^{N}$. Then from (44), we can write

$$A_{jk}^2 = \left(\frac{1}{N}\sum_{i=1}^{N} a_{ijk}\right)^2 = \frac{1}{N^2}\sum_{h=1}^{N}\sum_{i=1}^{N} a_{hjk}\, a_{ijk}. \tag{47}$$

We assume that what one person does on the item is experimentally
independent of what anybody else does on the item:

$$E_k\, a_{hjk}\, a_{ijk} = (E_k\, a_{hjk})(E_k\, a_{ijk}) = p_{hj}\, p_{ij} \qquad (h \ne i;\ h, i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, m). \tag{48}$$

If h = i, then of course, since $a_{ijk}^2 = a_{ijk}$,

$$E_k\, a_{ijk}^2 = p_{ij}. \tag{49}$$

Taking expectations over k in (47) and using (48) and (49) we have

$$E_k\, A_{jk}^2 = \frac{1}{N^2}\left[\sum_{h=1}^{N}\sum_{i=1}^{N} p_{hj}\, p_{ij} - \sum_{i=1}^{N} p_{ij}^2 + \sum_{i=1}^{N} p_{ij}\right],$$

or, using (13) and combining the last two terms,

$$E_k\, A_{jk}^2 = A_j^2 + \frac{1}{N}\, E_i\, p_{ij}(1 - p_{ij}). \tag{50}$$

The expectation in the second term on the right is always finite, and in
fact is nonnegative and not greater than 1/4, so that the second term
has the limit zero as $N \to \infty$. Hence,
$\lim_{N \to \infty} E_k\, A_{jk}^2 = A_j^2$, so that from (46),

$$\sigma_j^2 = 0. \tag{51}$$

Therefore, $A_{jk}$ is equal to its expectation $A_j$ except possibly in an
infinitesimal proportion of the trials.

This observability from a single trial implies that the marginal
proportions of any item based on an indefinitely large population have
perfect test-retest reliability, regardless of the reliability of the
individuals in the population.

More generally, in the joint occurrence table of two or more
experimentally independent items, each of the joint frequencies
observed on a single trial of a large population is perfectly reliable.
That is, if

$$C_{jgk} = E_i\, a_{ijk}\, b_{igk}, \tag{52}$$

and if $\sigma_{jg}^2$ is the variance of $C_{jgk}$ over trials, then

$$\sigma_{jg}^2 = E_k\, C_{jgk}^2 - C_{jg}^2 = 0, \tag{53}$$

and

$$C_{jgk} = C_{jg} \tag{54}$$

except possibly for an infinitesimal proportion of the trials. The
proof is identical in form to that of the marginals.

Therefore, both the marginal lower bound and the lower bound
based on joint occurrences are observable from but a single trial; and
the upper bounds are observable from but a single pair of trials.
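
For a finite population the variance over trials of a single-trial marginal is the second term of (50), which shrinks in proportion to 1/N. The sketch below is an illustration under the independence assumption of (48): it compares that closed form with a brute-force enumeration of every response pattern for a tiny, made-up population. The function names and probabilities are hypothetical.

```python
from itertools import product

def variance_by_formula(p_j):
    """(1/N) * mean over persons of p(1 - p), i.e. sum p(1 - p) / N**2."""
    N = len(p_j)
    return sum(p * (1 - p) for p in p_j) / N**2

def variance_by_enumeration(p_j):
    """Exact variance of the single-trial marginal over all 2**N outcomes."""
    N = len(p_j)
    mean = second_moment = 0.0
    for outcome in product([0, 1], repeat=N):
        prob = 1.0
        for a, p in zip(outcome, p_j):      # persons respond independently
            prob *= p if a else 1 - p
        A = sum(outcome) / N                # marginal observed on this trial
        mean += prob * A
        second_moment += prob * A * A
    return second_moment - mean**2

p_j = [0.5, 0.2, 0.9]                       # hypothetical p_ij for one category
print(round(variance_by_formula(p_j), 6), round(variance_by_enumeration(p_j), 6))

# The variance can never exceed 1/(4N), so it vanishes as N grows.
for N in (10, 100, 1000):
    print(N, variance_by_formula([0.5] * N))
```

The worst case is a population in which everyone has probability 1/2, and even then the variance is only 1/(4N).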


REFERENCES 


1. Guilford, J. P. Fundamental statistics in psychology and education. New
York: McGraw-Hill, 1942.

2. Guttman, Louis. An outline of the statistical theory of prediction, in P.
Horst, et al., The prediction of personal adjustment. New York: Social
Science Research Council, 1941.

3. Guttman, Louis. A basis for scaling qualitative data. Amer. Sociol. Rev., 1944,
9, 139-150.

4. Guttman, Louis. A basis for analyzing test-retest reliability. Psychometrika,
1945, 10, 255-282.








PSYCHOMETRIKA—VOL. 11, NO. 2 
JUNE, 1946 


THE SIGNIFICANCE OF DIFFERENCE BETWEEN MEANS 
WITHOUT REFERENCE TO THE FREQUENCY 
DISTRIBUTION FUNCTION 


LEON FESTINGER* 


RESEARCH CENTER FOR GROUP DYNAMICS 
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 


Most existing tests for significance of difference between means 
require specific assumptions concerning the distribution function 
in the parent population. The need for a test which can be applied 
without making any such assumption is stressed. Such a statistical 
test is derived. The application of the test involves converting scores 
to rank orders. The exact probabilities may then be calculated for 
specified differences between samples by means of which the null 
hypothesis may be tested. The application of the test is simple and 
requires a minimum of calculation. The test loses in precision be- 
cause of the conversion to rank orders but gains in generality since 
it may be safely used with any kind of distribution. 


1. Introduction 


From the point of view of a research worker one of the main 
problems concerning any statistical technique is its applicability to 
the particular set of data which he has on hand. From this view- 
point the simple problem of testing for significance of difference be- 
tween means becomes somewhat involved. He must examine the con- 
ditions under which he gathered his data and the characteristics of 
his data, and then decide which if any statistical techniques are avail- 
able for use. 

(a) He may have a fairly large number of cases in his samples. 
Under these conditions he may be reasonably certain that a distribu- 
tion of means from samples of this size would approach normality. 
The classical critical ratio could then usually be applied without fear 
of excessive error. There are instances, however, where this proce- 
dure will result in error. Holzinger and Church (4) have shown, for 
example, that for bimodal populations the distributions of means of 
samples do not approximate normality even with a large number of 
cases in the samples. Such instances would be relatively rare,
however.

*This study was started at the Iowa Child Welfare Research Station of the
State University of Iowa. I should like to express my gratitude for the help I
received there.

(b) He may have a relatively small number of cases in his
samples but know with a fair degree of certainty what the distribution
form of the parent population is. He may know this from previous
studies or even from a priori logical considerations. If the parent
population is known to approximate normality, then the “t” test
may be applied precisely and accurately. If the samples are drawn
from populations with known skewness, he may still be able to apply
exact tests to determine the statistical significance of the difference
between means (2, 3, 7). Even if the samples are drawn from a
population with a rectangular frequency distribution, the research
worker is not left without recourse (5).

If there is no exactly applicable test available the research work- 
er will probably apply one which will be in error. He is, moreover, 
in relative ignorance of the extent and direction of the error involved. 
The situation here is quite similar to the following case. 

(c) He may have a relatively small number of cases in his samples
with little or no knowledge concerning the distribution form of
the parent population. In such situations, it is extremely unlikely that
the distribution form of the parent population can be inferred from
the sample itself. Unfortunately, the common practice has become to
apply the “t” test in these cases and hope that the error will not be
too great. Let us examine the likelihood that this procedure will
result in serious error.

Most of the empirical work on the question has taken the direction
of actually drawing a large number of samples of specified size
from a population which differs to a certain degree and in a certain
direction from normality. The statistic “t” is then calculated for each
sample, and the obtained distribution of “t” is compared to the
theoretical “t” distribution. The conclusions drawn from this type of
work must of necessity be specific to the particular population
examined.

Most of the studies (6, 10) have dealt with unimodal distribu- 
tions which are symmetrical or else have a simple skewness. The 
major findings are: 

(a) Slight or moderate skews do not yield too much divergence
from the theoretical “t” distribution.

(b) Leptokurtic distributions yield marked divergence from the
theoretical “t” distribution.

Rider (8, 9) has done some excellent theoretical work on the
problem, concentrating on sampling from rectangular populations.
He concludes that serious error would be involved in the use of the
“t” test on such samples. He attributes the error mostly to the
interdependence of means and sigmas in samples from such populations.
The assumption of independence of mean and sigma in the “t” test
is not usually stressed very much but is probably one of the major
sources of error when the test is used with samples from non-normal
populations.

Many attempts have been made to use ranking methods to avoid
assumptions concerning the distribution form (1, 11). Some of these
still apply only to large samples while others are too complicated for
widespread use.

In the next section we shall derive a test, using ranking methods,
which seems to accomplish the desired purpose and which is very
simple to apply.


2. Derivation 

In order to avoid making any assumption concerning the exact 
distribution form of the parent population, let us agree to sacrifice 
some of the exactness of our data. Instead of taking account of the 
exact score of each case in the sample, we shall take into account 
only whether one score is greater than or less than another score. We 
can accomplish this by using rank orders instead of the original 
scores. 

If we then have two samples: 


Sample x: $x_1, x_2, \ldots, x_n$
Sample y: $y_1, y_2, \ldots, y_m$


we may attempt to test the hypothesis that they were both drawn at 
random from the same population, regardless of what the charac- 
teristics of this population are. 
Tentatively accepting the null hypothesis we may combine sample x
and sample y and assign rank orders to each case from 1 to n
plus m. The mean of all these ranks is of course (n + m + 1)/2.
The mean rank of those cases in sample x will no doubt differ from
this value by a certain amount. We may now ask the question as to
what the probability is of obtaining any such specified divergence if
both samples were drawn at random from the same population. If,
for the moment, we speak in terms of the sum of ranks, rather than
the mean rank, we can restate the question as follows: what are the
probabilities of obtaining any specified sum of ranks of n cases
selected at random from n plus m cases? We must then proceed to state
the exact probability distributions for sums of ranks for specified
values of m and n. The manner in which these probability distributions
were calculated is described below.
Let us start with n = 2 and m = 2. We are then dealing with
ranks in the total group of 1, 2, 3, and 4. We are interested in all
possible combinations of two of these ranks. If each case is
independent of every other case and equally likely to be drawn, then each
combination will be equiprobable. The following combinations are
possible: 1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, 3 and 4. We now see
that there are one out of six ways to obtain a sum of ranks equal to
3; one out of six ways to obtain a sum of ranks equal to 4; two out
of six ways to obtain a sum of ranks equal to 5; one out of six ways
to obtain a sum of ranks equal to 6; and one out of six ways to obtain
a sum of ranks equal to 7. This, then, is our probability distribution
for the sum of ranks of two cases selected at random from four ranked
cases. Other examples could be illustrated in similar fashion.
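
The enumeration for n = 2 and m = 2 can be reproduced mechanically. This is a minimal sketch using Python's standard library; the function name is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations

def rank_sum_counts(n, total):
    """Count the ways each sum arises when n ranks are drawn from 1..total."""
    return Counter(sum(c) for c in combinations(range(1, total + 1), n))

counts = rank_sum_counts(2, 4)
print(sorted(counts.items()))   # [(3, 1), (4, 1), (5, 2), (6, 1), (7, 1)]
```

Dividing each count by the six equally likely combinations gives the probabilities stated in the text. Direct enumeration of this kind becomes slow for larger samples.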

It is clear that to calculate the probability distributions in this
manner for larger numbers of cases would be almost prohibitive.
Fortunately, there are short cuts available. Let us examine what the
distribution would be for two cases selected at random from a total of
five ranked cases (n = 2; m = 3). The possibilities enumerated
above would still pertain. There would, in addition, be several other
combinations possible, namely, 1 and 5, 2 and 5, 3 and 5, 4 and 5.
The number of ways in which a given sum of ranks could be obtained
would not be affected for sums of ranks below 6 by the addition of
the fifth case to the ranks. The totals of 6, 7, 8, and 9 would each be
augmented by one more possibility. In a similar manner the addition
of a sixth rank to all the cases would not affect any sums of
ranks of two cases below 7, and the sums of ranks from 7 through
11 would each be augmented by one possibility. Thus, the probability
distributions for 2 cases in one sample and m cases in the other
sample may be quickly calculated.

Now we come to the calculation of the probability distributions
for 3 cases in one sample and m cases in the other. It is clear that
the distribution for 2 cases selected from 5 ranked ones is identical
with the distribution for 3 cases selected from 5 ranked ones, except
for the fact that the frequencies refer to different totals. The lowest
possible sum of 2 cases is 3, while the lowest possible sum of 3 cases
is 6. The probability distribution for the sum of three cases selected
at random from 5 ranked cases may then be copied directly, making
the appropriate adjustment in the sums to which the frequencies
apply.
For the probability distribution of 3 cases selected at random
from 6 ranked cases (n = 3; m = 3), it is clear that none of the
frequencies will be affected for totals below 9 by the addition of the
sixth rank. Starting with the sum of ranks equal to 9, the sums of
the new possible combinations of 3 cases will distribute themselves
exactly as the distribution for sums of two cases selected from 5
ranked ones. Thus, to obtain the distribution of sums of 3 cases
selected at random from 6 ranked cases we have merely to add the
distribution for 2 out of 5 to the distribution for 3 out of 5, making the
first addition at sum of ranks equal to 9.

Similarly, to obtain the distribution of sums of 3 cases selected
at random from 7 ranked cases we merely add to the distribution
of the sums of 3 cases selected at random from 6 ranked cases, the
distribution of 2 cases selected at random from 6 ranked cases,
making the first addition at sum of ranks equal to 10, since this will be
the lowest sum of ranks affected by the addition of the seventh rank.
We may thus proceed to build up our distributions. To obtain the
probability distribution of the sum of 7 cases selected at random from
18 ranked cases (n = 7; m = 11) we add to the distribution of sums of
seven cases selected at random from 17 ranked cases the distribution
of sums of 6 cases from 17 ranked cases, making the first addition of
frequencies for sum equal to 39, which will be the lowest sum where
the frequencies will be affected.
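
The progressive build-up described above amounts to a simple recurrence: the distribution for n cases out of N ranks is the distribution for n out of N − 1, plus the distribution for n − 1 out of N − 1 shifted upward by N. The sketch below is one way to code that idea, not the author's own worksheet procedure; the function name is hypothetical.

```python
from collections import Counter
from math import comb

def rank_sum_counts(n, total):
    """Frequencies of each sum of n ranks chosen from 1..total.

    table[k] maps a sum to the number of k-subsets of the ranks seen so far.
    Adding a new top rank leaves the old k-subsets in place and appends the
    (k-1)-subsets with their sums shifted by the new rank.
    """
    table = [Counter() for _ in range(n + 1)]
    table[0][0] = 1
    for rank in range(1, total + 1):
        # Descend in k so that table[k - 1] still excludes the new rank.
        for k in range(min(n, rank), 0, -1):
            for s, ways in table[k - 1].items():
                table[k][s + rank] += ways
    return dict(table[n])

counts = rank_sum_counts(7, 18)
print(sum(counts.values()) == comb(18, 7))   # every combination is counted once
print(min(counts), max(counts))              # smallest and largest possible sums
```

The same function also reproduces the symmetry used as a check in the calculation: the frequency of a sum s equals the frequency of n(N + 1) − s.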

It is in this progressive manner that the frequency distributions
were calculated. Throughout the calculation constant checks were
present. Since the distributions are symmetrical about

$$\frac{(n + m + 1)(n)}{2},$$

one-half of the distribution checked the other half.

We must now find a convenient method for summarizing and
presenting these distributions. Instead of dealing with the sums of ranks
of one of the samples, let us deal with the absolute deviation (d) of
the mean of the ranks of one sample from the mean of the ranks of
the total group. The following formula may be used:

$$d = \left| \frac{\sum R_n}{n} - \frac{n + m + 1}{2} \right|,$$

where n is the number of cases in the smaller sample, n + m is the
number of cases in both samples together, and $\sum R_n$ is the sum of the
ranks of the cases in the smaller sample. The last term in the
equation is of course the mean of the n + m ranked cases. From the
probability distributions we may now calculate the probability that a
given d or larger will occur by chance. Table 1 and Table 2 show the d
values necessary for significance at the 1% and 5% levels of
confidence, respectively.* Since the distributions of d are discrete, these
values do not correspond exactly to the 1% and 5% levels. The
values given are those which just exceed the appropriate level of
significance. Thus a d value as great as or greater than the one given
in the table will be significant at that level of confidence. For n’s
from 2 to 12, the tables present values up to (n + m) = 40. For n’s
of 13, 14, and 15, the tables present values up to (n + m) = 30.

* Complete tables have been set up for the different values of n and m, giving
P-values for various magnitudes of d within a wide range. These tables can be
secured for a small fee.

The computational steps in the test are quite simple. They are 
as follows: 

(a) Throw both samples together and rank each case. If there 
are ties, assign adjacent ranks at random. 

(b) Obtain the mean rank of the sample with the smaller num- 
ber of cases. 

(c) Calculate d by subtracting (n + m + 1)/2 from the re- 
sult of (b), giving d a positive sign. 

(d) Determine whether this d is equal to or greater than the 
appropriate d values in Tables 1 and 2. 
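
Steps (a) through (d) can be sketched in code. Since the published tables are not reproduced here, the sketch replaces the table lookup in step (d) with a direct calculation of the exact probability of a d at least as large as the one observed; that substitution, the function names, and the sample scores are assumptions made for illustration.

```python
import random
from collections import Counter
from math import comb

def rank_sum_counts(n, total):
    """Frequencies of each sum of n ranks chosen from 1..total."""
    table = [Counter() for _ in range(n + 1)]
    table[0][0] = 1
    for rank in range(1, total + 1):
        for k in range(min(n, rank), 0, -1):
            for s, ways in table[k - 1].items():
                table[k][s + rank] += ways
    return table[n]

def d_test(x, y, rng=None):
    """Return (d, P) for two independent samples of scores."""
    small, large = (x, y) if len(x) <= len(y) else (y, x)
    n, total = len(small), len(x) + len(y)
    # (a) Pool and rank; shuffling first gives tied scores adjacent ranks at random.
    pooled = [(v, 0) for v in small] + [(v, 1) for v in large]
    (rng or random).shuffle(pooled)
    pooled.sort(key=lambda t: t[0])
    # (b) Sum (hence mean) of the ranks in the smaller sample.
    rank_sum = sum(r for r, (_, grp) in enumerate(pooled, start=1) if grp == 0)
    # (c) d = |mean rank - (n + m + 1)/2|, kept in integers as twice the rank-sum deviation.
    centre2 = n * (total + 1)
    dev2 = abs(2 * rank_sum - centre2)
    d = dev2 / (2 * n)
    # (d) Exact probability of a deviation this large or larger.
    counts = rank_sum_counts(n, total)
    extreme = sum(w for s, w in counts.items() if abs(2 * s - centre2) >= dev2)
    return d, extreme / comb(total, n)

d, p = d_test([1.1, 2.3], [3.0, 4.2, 5.5, 6.1])
print(d, round(p, 4))   # 2.0 0.1333
```

In this example the smaller sample holds ranks 1 and 2, so its mean rank of 1.5 lies 2.0 below the overall mean of 3.5. Only 2 of the 15 possible pairs of ranks deviate that much, so even the most extreme outcome cannot reach the 5% level with samples this small.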

Thus, for any two samples we have a direct and exact test of 
significance which involves no assumption as to the form of distribu- 
tion of the parent population. 

The null hypothesis to be tested with the procedure should be, 
strictly, that the two samples are from the same population. Be- 
cause of the nature of the statistic d on which the test is based, it is 
believed, however, that the test will be most sensitive to differ- 
ences between means. It seems, therefore, that small p values may be 
taken as indicative of significant differences between means. 

The test does of course lose in precision when the scores are
converted to ranks, but some such loss is inevitable if no statement at all
is to be made about distribution form. The other limitation involved
is that the only hypothesis which can be tested is the null hypothesis.
This does not appear to be a serious limitation since it is the null
hypothesis which is most often in question.

Let us review the assumptions on which the test is based: 

(a) The two samples are drawn at random.

(b) The two samples are independent.

These are the only two assumptions involved. The second assumption,
of course, prohibits the use of the test in cases where pairs of
individuals are matched.





TABLE 1
d Values† for the 1% Level of Confidence

[Table body illegible in this copy. Columns: number of cases in smaller
sample. Rows: number of cases in larger sample.]

* Indicates that the smallest P-value obtainable is greater than .01.
† A d equal to or greater than the value indicated in the table is necessary
for significance at the 1% level of confidence.

TABLE 2
d Values† for the 5% Level of Confidence

[Table body illegible in this copy. Columns: number of cases in smaller
sample. Rows: number of cases in larger sample.]

* Indicates that the smallest P-value obtainable is greater than .05.
† A d equal to or greater than the value indicated in the table is necessary
for significance at the 5% level of confidence.







REFERENCES 


1. Dixon, W. J. A criterion for testing the hypothesis that two samples are
from the same population. Annals math. Stat., 1940, 11, 199-204.

2. Festinger, L. An exact test of significance for means of samples from
populations with an exponential frequency distribution. Psychometrika, 1943,
8, 153-160.

3. Festinger, L. A statistical test for means of samples from skew populations.
Psychometrika, 1943, 8, 205-210.

4. Holzinger, K. J. and Church, A. E. R. On the means of samples from a
U-shaped population. Biometrika, 1928, 20a, 361-388.

5. Neyman, J. and Pearson, E. S. On the use and interpretation of certain
test criteria for purposes of statistical inference. Biometrika, 1928, 20a,
175-240.

6. Pearson, E. S. and Adyanthaya, N. K. The distribution of frequency
constants in small samples from non-normal symmetrical and skew populations.
Biometrika, 1929, 21, 259-286.

7. Pearson, K., Stouffer, S. A. and David, F. N. Further application in
statistics of the $T_m(x)$ Bessel Function. Biometrika, 1932, 24, 293-350.

8. Rider, P. R. On small samples from certain non-normal universes. Annals
math. Stat., 1931, 2, 48-65.

9. Rider, P. R. On the distribution of the ratio of mean to standard deviation
in small samples from non-normal universes. Biometrika, 1929, 21, 124-148.

10. Sophister. Discussion of small samples drawn from an infinite skew
population. Biometrika, 1928, 20a, 389-423.

11. Wald, A. and Wolfowitz, J. On a test whether two samples are from the
same population. Annals math. Stat., 1940, 11, 147-162.








PSYCHOMETRIKA—VOL. 11, NO. 2 
JUNE, 1946 


GENERAL SOLUTION OF THE ANALYSIS OF VARIANCE AND 
COVARIANCE IN THE CASE OF UNEQUAL OR DISPRO- 
PORTIONATE NUMBERS OF OBSERVATIONS 
IN THE SUBCLASSES* 


FEI TSAO
NATIONAL CENTRAL UNIVERSITY, CHINA 


In this paper a preview of the problem is given. Then the 
mathematical solutions of estimating the sums of squares and prod- 
ucts of different sources of variation under different assumptions are 
presented. Two kinds of populations from which our samples are 
supposed to be drawn are specified. One is defined as possessing ap- 
proximately the same stratification as our sample; while the other 
is defined as having equal frequencies in the subclasses. For the first 
kind of population, we should use the restrictions of “the weighted 
means.” For the second kind, we should use the restrictions of “the 
unweighted means.” The assumptions of zero interactions and sig- 
nificant interactions are also considered. After working out the 
exact method, two approximate methods with appropriate statisti- 
cal assumptions to be fulfilled are given. 


The Problem and Its Literature 

The analysis of variance and covariance methods usually deal
with data composed of equal or proportionate numbers of observations
in the subclasses. In fields connected with human beings, such as
education and psychology, unequal representation in each cell of the
multiple-classification of data is of common occurrence. Even in fields
such as agriculture and biology it is sometimes unavoidable or
desirable to have disproportionate numbers of observations in the
subclasses. Therefore, the need is very urgent for a systematic
formulation of general methods of attacking the problems under such
conditions.

In 1931 Fisher described verbally the application of the method of
least squares to the solution of a two-way table which consists of
unequal frequencies. No record of publication has been found (7, 268).
In 1932, Brandt (1) devised a method for dealing with a 2 × s table
with disproportionate frequencies. He, however, had not succeeded
in setting up a mathematical formulation underlying the experimental
condition. About the same time, Yates (11, 12) gave several methods
of dealing with the problems of unequal frequencies. He presented a
mathematical formulation of his method of fitting constants. The
restrictions which were used appear to assume that each subclass of
a classification contributes the same amount of variation whatever
the number of observations in the subclass. We may call these
restrictions “the unweighted means.” Under such conditions the
frequencies of all subclasses in the population are assumed to be equal.
But this assumption may not be fulfilled in all cases. In education,
for instance, one might consider the schools and grades under
investigation as a whole sample from the population of all possible schools
and grades with similar stratification. We may call these restrictions
“the weighted means.” Yates (11, 50-60) also presented an approximate
method of weighted squares of means assuming that interactions
exist. The method is rather tedious. Finally, he gave an approximate
method of analysis. Although he pointed out that this approximation
is useful only when the class numbers do not differ very
greatly, he gave no statistical criterion to test whether or not “the
class numbers do not differ very greatly.”

* For a more complete account, see:
Fei Tsao, General solution of the analysis of variance and covariance in the
case of unequal or disproportionate numbers of observations in the subclasses.
Ph.D. Thesis, University of Minnesota, 1945. Pp. 120.

In 1934-5, Snedecor and Cox (6, 7) introduced an approximate
method which they called the method of expected subclass numbers.
The first step is to make a quantitative test of the validity of the
assumption of proportionate subclass numbers by using the $\chi^2$
criterion. In retaining the original variance within subclasses, the
variances for any other components are worked out with the
proportionate expected subclass numbers and expected sums or means in
the usual manner. As we know, the original variance for “within”
comes from the original data of unequal or disproportionate
frequencies. The writer questions the validity of retaining the “within”
variance derived from the original data, while the other variances are
derived from the adjusted data. This criticism can also be made for
the approximate method of Yates.

In 1938, Wilks (10) introduced the use of Kronecker deltas in 
formulating the mathematical presentation and extended it from two 
to n classifications. Basically, however, his solution is not different 
from the solution of Yates. Wald (9), in 1941, considered the case 
of unequal class frequencies in the analysis of variance. He simply 
pooled all sources of interactions with the error. In the meantime, 
Nair (5) presented his solution of the same problem. As a whole, 
his restrictions and results were in general the same as those of 
Yates and Wilks. In 1942, Tsao (8) gave his solution of the analy- 
sis of variance for the problem of unequal or disproportionate sub- 
class numbers. He used the restrictions of “the weighted means.” 










He likewise did not go beyond the conditions of the insignificant in- 
teractions. 

Consequently, the writer treats the problem of analysis of vari- 
ance and covariance for unequal or disproportionate representation 
in the subclasses by presenting: 

I. the mathematical solution with corresponding restrictions
clearly defined;*

II. new approximate methods with respective statistical as- 
sumptions to be fulfilled. 


I. THE MATHEMATICAL SOLUTIONS OF THE PROBLEM 


1. Analysis of Variance 
Test of the Significance of Interactions.—We are not considering 
the case of unequal frequencies with only one classification. Let us 
start with the problem of double classification. Let all n observa- 
tions on a variable y be capable of two classifications, say, into col- 
umns and rows. The assumption leading to the analysis of variance 
is that we may write: 


$$y_{sit} = M + A_s + B_i + I_{si} + x_{sit} \tag{1}$$

where $s = 1, \dots, p$; $i = 1, \dots, q$; $t = 1, \dots, n_{si}$; $p$ denotes the
number of columns; $q$ denotes the number of rows; $n_{si}$ denotes the
number of observations in the subclass $(s, i)$; $y_{sit}$ is the $t$-th
observation in the subclass $(s, i)$; $M$ is the general mean; $A_s$ is the
deviation due to the $s$-th column; $B_i$ is the deviation due to the $i$-th
row; $I_{si}$ represents the influence of the interaction between column
and row; and $x_{sit}$ represents the random effects. Let us define:


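$$\sum_i n_{si} = n_{s\cdot} \quad (s = 1, \dots, p) \tag{2}$$

$$\sum_s n_{si} = n_{\cdot i} \quad (i = 1, \dots, q) \tag{3}$$

$$\sum_s \sum_i n_{si} = \sum_s n_{s\cdot} = \sum_i n_{\cdot i} = n_{\cdot\cdot} \quad (s = 1, \dots, p;\ i = 1, \dots, q) \tag{4}$$

$$T_{si} = \sum_t y_{sit} \quad (t = 1, \dots, n_{si}) \tag{5}$$

$$T_{s\cdot} = \sum_i \sum_t y_{sit} = \sum_i T_{si} \quad (i = 1, \dots, q;\ t = 1, \dots, n_{si}) \tag{6}$$

$$T_{\cdot i} = \sum_s \sum_t y_{sit} = \sum_s T_{si} \quad (s = 1, \dots, p;\ t = 1, \dots, n_{si}) \tag{7}$$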


*It should be emphasized here that the mathematical solution of this paper 
is not merely the replication of previous works. All the possible situations which 
may occur in practical cases are considered. 










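$$T_{\cdot\cdot} = \sum_s \sum_i \sum_t y_{sit} = \sum_s \sum_i T_{si} = \sum_s T_{s\cdot} = \sum_i T_{\cdot i} \quad (s = 1, \dots, p;\ i = 1, \dots, q;\ t = 1, \dots, n_{si}) \tag{8}$$

$$\bar{y}_{si} = \frac{T_{si}}{n_{si}} \tag{9}$$

$$\bar{y}_{s\cdot} = \frac{T_{s\cdot}}{n_{s\cdot}} \tag{10}$$

$$\bar{y}_{\cdot i} = \frac{T_{\cdot i}}{n_{\cdot i}} \tag{11}$$

$$\bar{y}_{\cdot\cdot} = \frac{T_{\cdot\cdot}}{n_{\cdot\cdot}} \tag{12}$$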
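The definitions (2) to (12) can be illustrated with a minimal Python sketch. The 2 x 3 table below is hypothetical data invented for illustration, and the variable names (`cells`, `n_col`, `T_row`, and so on) are assumptions of this sketch, not notation from the paper.

```python
# Hypothetical 2 x 3 table: cells[(s, i)] holds the observations y_sit for
# column s and row i (0-based indices here; the paper counts from 1).
cells = {
    (0, 0): [3, 5],    (0, 1): [4, 6, 8], (0, 2): [7],
    (1, 0): [2, 4, 6], (1, 1): [5],       (1, 2): [6, 8, 10, 12],
}
p, q = 2, 3

n = {k: len(v) for k, v in cells.items()}   # n_si
T = {k: sum(v) for k, v in cells.items()}   # T_si, eq. (5)

n_col = [sum(n[s, i] for i in range(q)) for s in range(p)]  # n_s. , eq. (2)
n_row = [sum(n[s, i] for s in range(p)) for i in range(q)]  # n_.i , eq. (3)
n_tot = sum(n_col)                                          # n_.. , eq. (4)

T_col = [sum(T[s, i] for i in range(q)) for s in range(p)]  # T_s. , eq. (6)
T_row = [sum(T[s, i] for s in range(p)) for i in range(q)]  # T_.i , eq. (7)
T_tot = sum(T_col)                                          # T_.. , eq. (8)

ybar_cell = {k: T[k] / n[k] for k in cells}                 # eq. (9)
ybar_col = [T_col[s] / n_col[s] for s in range(p)]          # eq. (10)
ybar_row = [T_row[i] / n_row[i] for i in range(q)]          # eq. (11)
ybar_tot = T_tot / n_tot                                    # eq. (12)
```

Because the subclass numbers differ, the marginal means here are automatically weighted by the number of observations in each subclass.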


Now the problem of getting the maximum likelihood estimates of $M$,
$A_s$, $B_i$, and $I_{si}$ reduces to minimizing:

$$\phi = \sum_s \sum_i \sum_t (y_{sit} - M - A_s - B_i - I_{si})^2 \tag{13}$$

subject to the following restrictions:

$$\sum_s n_{s\cdot} A_s = 0, \tag{14}$$

$$\sum_i n_{\cdot i} B_i = 0, \tag{15}$$

$$\sum_s \sum_i n_{si} I_{si} = 0. \tag{16}$$

These restrictions assume that our sample has approximately the same
stratification as the parent population and that the unequal
frequencies in the subclasses will affect the values of marginal means
and the general mean. Therefore, we have to give some weight to each
subclass. We may call these restrictions the restrictions of “the
weighted means.” Using Lagrange’s multipliers to make allowance for
these restrictions, we have

$$\phi = \sum_s \sum_i \sum_t (y_{sit} - M - A_s - B_i - I_{si})^2 + 2\lambda_1 \sum_s n_{s\cdot} A_s + 2\lambda_2 \sum_i n_{\cdot i} B_i + 2\lambda_3 \sum_s \sum_i n_{si} I_{si}. \tag{17}$$

By the maximum likelihood method,* we get a set of normal equa- 
tions: 


* Here the maximum likelihood method is a process of partial-differentiating
$\phi$ with respect to each estimated constant.










$$
\begin{aligned}
n_{\cdot\cdot} M &= T_{\cdot\cdot} \\
n_{s\cdot} M + n_{s\cdot} A_s + \sum_i n_{si} B_i + \sum_i n_{si} I_{si} + n_{s\cdot} \lambda_1 &= T_{s\cdot} \\
n_{\cdot i} M + \sum_s n_{si} A_s + n_{\cdot i} B_i + \sum_s n_{si} I_{si} + n_{\cdot i} \lambda_2 &= T_{\cdot i} \\
n_{si} M + n_{si} A_s + n_{si} B_i + n_{si} I_{si} + n_{si} \lambda_3 &= T_{si} \\
\sum_s n_{s\cdot} A_s &= 0 \\
\sum_i n_{\cdot i} B_i &= 0 \\
\sum_s \sum_i n_{si} I_{si} &= 0
\end{aligned}
\tag{18}
$$

We know that $A_s$ implies $A_1, \dots, A_p$; $B_i$ implies $B_1, \dots, B_q$; and $I_{si}$
implies $I_{11}, \dots, I_{pq}$. It is evident that (18) really implies a
$(4 + p + q + pq)$-rowed matrix. It is difficult to obtain the values of $M$,
$A_s$, $B_i$, and $I_{si}$ by using the method of determinants. By the method
of elimination, however, we immediately know that:

$$\lambda_1 = \lambda_2 = \lambda_3 = 0. \tag{19}$$

And also we can easily obtain the absolute minimum value of $\phi$,
which is the sum of squares due to “within:”*

$$\chi_a^2 = \sum_s \sum_i \sum_t y_{sit}^2 - \sum_s \sum_i \frac{T_{si}^2}{n_{si}}, \tag{20}$$

with $n_{\cdot\cdot} - pq$ degrees of freedom. Then this estimate can be used
as a basis of testing the following hypothesis:

$$H_0: E(I_{si}) = 0 \tag{21}$$

($E$ is the notation for the expectation of a parameter) i.e., the
hypothesis that there is no influence of interaction between column
and row. Assuming that $H_0$ is true, we obtain, from (17):

$$\phi = \sum_s \sum_i \sum_t (y_{sit} - M - A_s - B_i)^2 + 2\epsilon_1 \sum_s n_{s\cdot} A_s + 2\epsilon_2 \sum_i n_{\cdot i} B_i, \tag{22}$$

where $\epsilon_1$ and $\epsilon_2$ are Lagrange’s multipliers. Likewise, we have another
set of normal equations:


* If we use the restrictions of the “unweighted means,” i.e.,
$\sum_s A_s = 0$, $\sum_i B_i = 0$, $\sum_s \sum_i I_{si} = 0$,
and follow similar procedures, we can obtain exactly the same value of the sum
of squares due to “within.” The detailed derivation is omitted here.













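$$
\begin{aligned}
n_{\cdot\cdot} M &= T_{\cdot\cdot} \\
n_{s\cdot} M + n_{s\cdot} A_s + \sum_i n_{si} B_i + n_{s\cdot} \epsilon_1 &= T_{s\cdot} \\
n_{\cdot i} M + \sum_s n_{si} A_s + n_{\cdot i} B_i + n_{\cdot i} \epsilon_2 &= T_{\cdot i} \\
\sum_s n_{s\cdot} A_s &= 0 \\
\sum_i n_{\cdot i} B_i &= 0
\end{aligned}
\tag{23}
$$

where

$$\epsilon_1 = \epsilon_2 = 0. \tag{24}$$

By the well-known least squares principle (2, 58-60), the sum of
squares due to the fitted constants is:

$$M T_{\cdot\cdot} + \sum_s A_s T_{s\cdot} + \sum_i B_i T_{\cdot i} \tag{25}$$

with $p + q - 1$ degrees of freedom. Consequently, we obtain the
relative minimum value for $\phi$ in (22):

$$\chi_r^2 = \sum_s \sum_i \sum_t y_{sit}^2 - M T_{\cdot\cdot} - \sum_s A_s T_{s\cdot} - \sum_i B_i T_{\cdot i} \tag{26}$$

with $n_{\cdot\cdot} - p - q + 1$ degrees of freedom. Substituting for $B$’s in
terms of $A$’s, this reduces to:

$$\chi_r^2 = \sum_s \sum_i \sum_t y_{sit}^2 - \sum_s A_s \Big(T_{s\cdot} - \sum_i n_{si} \bar{y}_{\cdot i}\Big) - \sum_i n_{\cdot i} \bar{y}_{\cdot i}^2 = \chi_a^2 + \chi_1^2. \tag{27}$$

Similarly, substituting for $A$’s in terms of $B$’s, (26) reduces to:

$$\chi_r^2 = \sum_s \sum_i \sum_t y_{sit}^2 - \sum_i B_i \Big(T_{\cdot i} - \sum_s n_{si} \bar{y}_{s\cdot}\Big) - \sum_s n_{s\cdot} \bar{y}_{s\cdot}^2 = \chi_a^2 + \chi_1^2. \tag{28}$$

Let us define:

$$T_{s\cdot} - \sum_i n_{si} \bar{y}_{\cdot i} = Q_{s\cdot}, \tag{29}$$

$$T_{\cdot i} - \sum_s n_{si} \bar{y}_{s\cdot} = Q_{\cdot i}. \tag{30}$$

We immediately know that:

$$\sum_s Q_{s\cdot} = 0, \tag{31}$$

$$\sum_i Q_{\cdot i} = 0. \tag{32}$$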


It is interesting to see that from (23) we obtain: 










$$n_{s\cdot} A_s - \sum_i \frac{n_{si}}{n_{\cdot i}} \sum_{s'} n_{s'i} A_{s'} = Q_{s\cdot} \quad (s, s' = 1, \dots, p;\ i = 1, \dots, q), \tag{33}$$

which implies $p$ equations with $p - 1$ of them linearly independent.
Replacing any one of them by the restriction (14), we can solve for the
$p$ unknowns from $p$ linearly independent equations. Similarly, from (23),
we obtain:

$$n_{\cdot i} B_i - \sum_s \frac{n_{si}}{n_{s\cdot}} \sum_{i'} n_{si'} B_{i'} = Q_{\cdot i} \quad (s = 1, \dots, p;\ i, i' = 1, \dots, q), \tag{34}$$

which implies $q$ equations with $q - 1$ of them being linearly
independent. Replacing any one of them by the restriction (15), we
likewise can solve for the $q$ unknowns from $q$ linearly independent
equations. By so doing, either from (33) and (27) or from (34) and (28),
we obtain the numerical value of $\chi_r^2$ and consequently the numerical
value of $\chi_1^2$, i.e., the sum of squares due to “interaction” with
$(p - 1)(q - 1)$ degrees of freedom. Then the test of significance of
interaction is given by:

$$F = \frac{n_{\cdot\cdot} - pq}{(p - 1)(q - 1)} \cdot \frac{\chi_1^2}{\chi_a^2} \tag{35}$$

with $n_1 = (p - 1)(q - 1)$ and $n_2 = n_{\cdot\cdot} - pq$.
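The steps from (20) through (35) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's computing routine: the data are hypothetical, every subclass is assumed to contain at least one observation, and the helper names `solve` and `interaction_test` are invented here. The equations (33) are solved by plain Gaussian elimination after the last one is replaced by restriction (14).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(m):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[r][m] / aug[r][r] for r in range(m)]


def interaction_test(cells, p, q):
    """Return (chi_a, chi_1, F, (n1, n2)) for a p x q table of observations."""
    n = [[len(cells[s, i]) for i in range(q)] for s in range(p)]
    T = [[sum(cells[s, i]) for i in range(q)] for s in range(p)]
    n_col = [sum(n[s]) for s in range(p)]
    n_row = [sum(n[s][i] for s in range(p)) for i in range(q)]
    T_col = [sum(T[s]) for s in range(p)]
    T_row = [sum(T[s][i] for s in range(p)) for i in range(q)]
    sum_y2 = sum(y * y for obs in cells.values() for y in obs)

    # Sum of squares for "within", eq. (20).
    chi_a = sum_y2 - sum(T[s][i] ** 2 / n[s][i]
                         for s in range(p) for i in range(q))

    # Q_s. of eq. (29).
    Q = [T_col[s] - sum(n[s][i] * T_row[i] / n_row[i] for i in range(q))
         for s in range(p)]

    # Coefficients of eq. (33); the last equation is replaced by (14).
    a = [[(n_col[s] if s == r else 0.0)
          - sum(n[s][i] * n[r][i] / n_row[i] for i in range(q))
          for r in range(p)] for s in range(p)]
    a[-1] = [float(x) for x in n_col]
    A = solve(a, Q[:-1] + [0.0])

    # Relative minimum, eq. (27); interaction is chi_r - chi_a.
    chi_r = (sum_y2 - sum(A[s] * Q[s] for s in range(p))
             - sum(T_row[i] ** 2 / n_row[i] for i in range(q)))
    chi_1 = chi_r - chi_a
    n1, n2 = (p - 1) * (q - 1), sum(n_col) - p * q
    F = (chi_1 / n1) / (chi_a / n2)          # eq. (35)
    return chi_a, chi_1, F, (n1, n2)


# Hypothetical 2 x 3 table with unequal subclass numbers.
cells = {
    (0, 0): [3, 5],    (0, 1): [4, 6, 8], (0, 2): [7],
    (1, 0): [2, 4, 6], (1, 1): [5],       (1, 2): [6, 8, 10, 12],
}
chi_a, chi_1, F, df = interaction_test(cells, 2, 3)
```

The resulting `F` would be referred to the F distribution with `df` degrees of freedom; the lookup is left out to keep the sketch free of external libraries.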


Now we wish to consider the restrictions of ‘‘the unweighted 
means.” The assumption underlying these restrictions is that the par- 
ent population has equal frequencies in the subclasses. Using the 
previous notations, these restrictions are: 


$$\sum_s A_s = 0, \tag{36}$$

$$\sum_i B_i = 0, \tag{37}$$

$$\sum_s \sum_i I_{si} = 0. \tag{38}$$


We have no difficulty in getting the same numerical value of $\chi_a^2$ as
the value obtained by using the restrictions of “the weighted means”
as appeared in (20). Now we wish to know if the values of $\chi_r^2$
under the alternative restrictions are identical. Following the same
procedure as before, the value of $\chi_r^2$ can also be expressed as in (27)
or (28). And again, the least squares estimates of $A_s$ can be
determined by (33), where $p - 1$ equations are linearly independent, and
the restriction (36). Similarly, the least squares estimates of $B_i$ can
be determined by (34), where $q - 1$ equations are linearly independent,
and the restriction (37). Up to this point, we still cannot say
that the values of $\chi_r^2$ under the alternative restrictions are identical.


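Let us consider the case of $p = 2$ and $q = q$. (33) under the
restriction of (14) reduces to:

$$
\begin{aligned}
n_{1\cdot} A_1 - \sum_i \frac{n_{1i}}{n_{\cdot i}} \Big( n_{1i} A_1 - \frac{n_{1\cdot} n_{2i}}{n_{2\cdot}} A_1 \Big) &= Q_{1\cdot} \\
n_{2\cdot} A_2 - \sum_i \frac{n_{2i}}{n_{\cdot i}} \Big( n_{2i} A_2 - \frac{n_{2\cdot} n_{1i}}{n_{1\cdot}} A_2 \Big) &= Q_{2\cdot}
\end{aligned}
\tag{39}
$$

Solving (39) for $A_1$ and $A_2$, we obtain:

$$A_1 = \frac{n_{2\cdot} Q_{1\cdot}}{n_{\cdot\cdot} \sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}}, \tag{40}$$

$$A_2 = \frac{n_{1\cdot} Q_{2\cdot}}{n_{\cdot\cdot} \sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}}. \tag{41}$$

Substituting these values into (27) and using (31), we obtain:

$$\chi_r^2 = \sum_s \sum_i \sum_t y_{sit}^2 - \frac{Q_{1\cdot}^2}{\sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}} - \sum_i n_{\cdot i} \bar{y}_{\cdot i}^2 = \chi_a^2 + \chi_1^2. \tag{42}$$

Similarly, (33) under the restriction of (36) reduces to:

$$
\begin{aligned}
n_{1\cdot} A_1 - \sum_i \frac{n_{1i}}{n_{\cdot i}} ( n_{1i} A_1 - n_{2i} A_1 ) &= Q_{1\cdot} \\
n_{2\cdot} A_2 - \sum_i \frac{n_{2i}}{n_{\cdot i}} ( n_{2i} A_2 - n_{1i} A_2 ) &= Q_{2\cdot}
\end{aligned}
\tag{43}
$$

Solving (43) for $A_1$ and $A_2$, we obtain:

$$A_1 = \frac{Q_{1\cdot}}{2 \sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}}, \tag{44}$$

$$A_2 = \frac{Q_{2\cdot}}{2 \sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}}. \tag{45}$$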










Substituting these values into (27) and using (31), we obtain:

$$\chi_r^2 = \sum_s \sum_i \sum_t y_{sit}^2 - \frac{Q_{1\cdot}^2}{\sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}} - \sum_i n_{\cdot i} \bar{y}_{\cdot i}^2 = \chi_a^2 + \chi_1^2, \tag{46}$$

which is identical to (42). We have used the case of $p = 2$ and $q = q$,
without loss of generality in a mathematical sense, to verify that the
estimated value of $\chi_r^2$ is identical either using the restrictions of “the
weighted means” or using the restrictions of “the unweighted means.”
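The equality of (42) and (46) can be checked numerically. The sketch below takes hypothetical subclass numbers and a hypothetical value of $Q_{1\cdot}$ (with $Q_{2\cdot} = -Q_{1\cdot}$ by (31)), forms the fitted constants under each set of restrictions, and compares the reduction $\sum_s A_s Q_{s\cdot}$ that enters (27). All names are assumptions of this sketch.

```python
n1 = [2, 3, 1]          # hypothetical n_1i
n2 = [3, 1, 4]          # hypothetical n_2i
Q1 = -0.85              # hypothetical Q_1.
Q2 = -Q1                # by (31), Q_1. + Q_2. = 0

n1_tot, n2_tot = sum(n1), sum(n2)
n_tot = n1_tot + n2_tot
D = sum(a * b / (a + b) for a, b in zip(n1, n2))   # sum_i n_1i n_2i / n_.i

# "Weighted means" restrictions: eqs. (40) and (41).
A1_w = n2_tot * Q1 / (n_tot * D)
A2_w = n1_tot * Q2 / (n_tot * D)

# "Unweighted means" restrictions: eqs. (44) and (45).
A1_u = Q1 / (2 * D)
A2_u = Q2 / (2 * D)

# Reduction term sum_s A_s Q_s. appearing in (27).
reduction_w = A1_w * Q1 + A2_w * Q2
reduction_u = A1_u * Q1 + A2_u * Q2
```

The individual constants differ between the two sets of restrictions, but both reductions equal $Q_{1\cdot}^2 / \sum_i (n_{1i} n_{2i} / n_{\cdot i})$, which is why the interaction sum of squares is unaffected.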

For the problems of k classifications, the estimates of the sums 
of squares due to various interactions can be worked out in the same 
manner. 


Test of the Significance of Main Effects.—This topic can be treat- 
ed in two subdivisions: (a) the method under the condition that the 
interactions are insignificant; (b) the method under the condition 
that some interactions are significant. 

(a) In the last section, we have shown how to test the signifi- 
cance of interactions. Suppose we have accepted all the hypotheses 
regarding zero interactions; then the problem of testing the main ef- 
fects becomes very simple. For instance, if we have two classifica- 
tions, say p columns and gq rows, we may test the following hypothesis 
by using the same notations which appeared previously: 


$$H_{01}: E(A_s) = 0 \mid E(I_{si}) = 0, \tag{47}$$

i.e., the hypothesis that there is no column effect. From (22) we may
write:

$$\phi = \sum_s \sum_i \sum_t (y_{sit} - M - B_i)^2, \tag{48}$$

subject to restriction of either (14) or (36). Following similar
procedures, we obtain the relative minimum of $\phi$ in (48):


: at Ae + SB (7.4 — Sai Fs.) 
(49) 





X01 — = ZEB" sit 


ee te, 8 
where $\chi_a^2$ is defined as in (20); $\chi_1^2$ is defined as in (28); and $B_i$ is
defined as in (34). We know that $\chi_a^2$ is the sum of squares for “within”
and that $\chi_1^2$ is the sum of squares for “interaction.” Since our
assumption is that there is no interaction effect, the values of $\chi_1^2$ and
$\chi_a^2$ are of the same magnitude. Consequently, they can be pooled
together and used as a basis of testing $H_{01}$ (4, 140). Usually, we call
the pooled sum of squares “the residual.” Then the test of $H_{01}$ is
given by:

$$F_{01} = \frac{n_{\cdot\cdot} - p - q + 1}{p - 1} \cdot \frac{\chi_{01}^2}{\chi_a^2 + \chi_1^2} \tag{50}$$

with $n_1 = p - 1$ and $n_2 = n_{\cdot\cdot} - p - q + 1$. Similarly, we can test
the following hypothesis by the same procedures:

$$H_{02}: E(B_i) = 0 \mid E(I_{si}) = 0, \tag{51}$$

i.e., the hypothesis that there is no row effect. We obtain the valid
estimate of the sum of squares due to row effects as below:

$$\chi_{02}^2 = \sum_s A_s \Big( T_{s\cdot} - \sum_i n_{si} \bar{y}_{\cdot i} \Big), \tag{52}$$

where $A_s$ is defined as in (33). Again, the test of $H_{02}$ is given by:

$$F_{02} = \frac{n_{\cdot\cdot} - p - q + 1}{q - 1} \cdot \frac{\chi_{02}^2}{\chi_a^2 + \chi_1^2} \tag{52}$$

with $n_1 = q - 1$ and $n_2 = n_{\cdot\cdot} - p - q + 1$.

By mathematical deductions we can use similar procedures to 
solve the problems of k classifications. 

(b) We have shown that we are able to estimate the sums of 
squares due to various interactions and apply the test of significance 
on the basis of “within.” If we find that some of the interactions, but 
not all, are insignificant, then it is evident that these interactions are 
of the same magnitude as “within.” In order to increase the precision 
of the test of significance, we pool those insignificant interactions with 
the “within.” We may regard the pooled sum of squares as “resi- 
dual,”’ which can be used as a basis of testing the main effects. We 
use the term “residual” in order to distinguish from the term “with- 
in.” It is noted that the interactions of first order* are of immediate 
interest. So, without loss of generality, we can treat the problem as if 
the data have two classifications which are connected with the sig- 
nificant interaction of first order. Let us use the same assumptions 
as in (1). For convenience, however, we may write: 


$$y_{sit} = \xi_{si} + x_{sit} \quad (s = 1, \dots, p;\ i = 1, \dots, q;\ t = 1, \dots, n_{si}), \tag{53}$$

where $y_{sit}$ is the $t$-th observation in the subclass $(s, i)$; $x_{sit}$ is a
random effect; and $\xi_{si}$ is defined as the expected value of $y_{sit}$, i.e.,

$$\xi_{si} = E(y_{sit}). \tag{54}$$

* The reader may refer to the footnote of the paper, P. O. Johnson and F.
Tsao, Factorial design and covariance in the study of individual educational
development. Psychometrika, 1945, 10, p. 187, as to the meaning of the interaction
of first order.

In this case, we assume that the hypothesis of zero interaction has
been rejected. We have to develop tests of hypotheses regarding main
effects assuming $E(I_{si}) \neq 0$. First we wish to test the following
hypothesis:

$$H_{01}: E(A_s) = 0 \mid E(I_{si}) \neq 0, \tag{55}$$

i.e., the hypothesis that there is no column effect provided that the
interaction between column and row is existent. The condition
involved in this hypothesis is:

$$\xi_{m\cdot} - \xi_{p\cdot} = 0 \quad (m = 1, \dots, p - 1). \tag{56}$$


There are two definitions for this condition: (1) using the restrictions 
of “the weighted means” and (2) using the restrictions of “the un- 
weighted means.” We have shown previously that in testing the sig- 
nificance of interactions we can use either of these two restrictions. 
And also we have shown that in testing the significance of main ef- 
fects we can likewise use either of these two restrictions, if there 
does not exist any interaction. However, this fact does not hold for 
the present situation. For the restrictions of “the weighted means,” 
condition (56) becomes: 


$$\frac{1}{n_{m\cdot}} \sum_i n_{mi} \xi_{mi} - \frac{1}{n_{p\cdot}} \sum_i n_{pi} \xi_{pi} = 0 \quad (m = 1, \dots, p - 1;\ i = 1, \dots, q). \tag{57}$$


Assuming that the hypothesis Ho, is true and subject to the condition 
(57), we may write: 


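$$\phi = \sum_s \sum_i \sum_t (y_{sit} - \xi_{si})^2 + 2 \sum_m \lambda_m \Big( \frac{1}{n_{m\cdot}} \sum_i n_{mi} \xi_{mi} - \frac{1}{n_{p\cdot}} \sum_i n_{pi} \xi_{pi} \Big), \tag{58}$$

where $(\lambda_m)$’s $(m = 1, \dots, p - 1)$ are multipliers of Lagrange. Using
the maximum likelihood method, we obtain the following normal
equations:

$$n_{mi} \xi_{mi} + \frac{n_{mi} \lambda_m}{n_{m\cdot}} = T_{mi}, \tag{59}$$

$$n_{pi} \xi_{pi} - \frac{n_{pi}}{n_{p\cdot}} \sum_m \lambda_m = T_{pi}, \tag{60}$$

$$n_{p\cdot} \sum_i n_{mi} \xi_{mi} - n_{m\cdot} \sum_i n_{pi} \xi_{pi} = 0, \tag{61}$$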










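where $m = 1, \dots, p - 1$; $i = 1, \dots, q$. Solving these equations for the
values of $\xi_{pi}$, $\xi_{mi}$, and $\lambda_m$ and substituting the obtained values into
(58), we obtain:

$$\chi_{r01'}^2 = \chi_a^2 + \sum_{\substack{u < v \\ u, v = 1, \dots, p}} \frac{n_{u\cdot} n_{v\cdot}}{n_{\cdot\cdot}} (\bar{y}_{u\cdot} - \bar{y}_{v\cdot})^2 = \chi_a^2 + \chi_{01'}^2. \tag{62}$$

Then the test of $H_{01}$ is given by:

$$F_{01'} = \frac{\text{d.f. of residual}}{p - 1} \cdot \frac{\chi_{01'}^2}{\text{sum of squares for residual}}, \tag{63}$$

with $n_1 = p - 1$ and $n_2$ = d.f. of residual.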
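The pairwise form of the column sum of squares under “the weighted means” is algebraically the same as the familiar weighted sum of squares of the column means about the general mean. A small Python sketch, with hypothetical numbers and hypothetical function names, illustrates the identity:

```python
from itertools import combinations


def weighted_column_ss(n_col, ybar_col):
    """Pairwise form: sum over u < v of n_u. n_v. / n_.. times the squared
    difference of the column means."""
    n_tot = sum(n_col)
    return sum(n_col[u] * n_col[v] / n_tot * (ybar_col[u] - ybar_col[v]) ** 2
               for u, v in combinations(range(len(n_col)), 2))


def between_column_ss(n_col, ybar_col):
    """Usual form: sum over s of n_s. (ybar_s. - ybar_..)^2."""
    n_tot = sum(n_col)
    grand = sum(n * y for n, y in zip(n_col, ybar_col)) / n_tot
    return sum(n * (y - grand) ** 2 for n, y in zip(n_col, ybar_col))


# Hypothetical marginal numbers and means for p = 2 and p = 3 columns.
ss_two = weighted_column_ss([6, 8], [5.5, 6.625])
ss_three = weighted_column_ss([4, 7, 9], [2.0, 5.0, 3.0])
```

The pairwise form needs only the marginal numbers and marginal means, which is convenient for desk computation.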
For the restrictions of “the unweighted means,” the condition 
(56) becomes: 

















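$$\sum_i \xi_{mi} - \sum_i \xi_{pi} = 0 \quad (m = 1, \dots, p - 1) \tag{64}$$

for the hypothesis $H_{01}$. Proceeding as before, we have:

$$\chi_{r01''}^2 = \chi_a^2 + \frac{D}{E} = \chi_a^2 + \chi_{01''}^2, \tag{65}$$

where

$$D = \sum_{(a, \dots, b)(c, d)} \underbrace{\Big( \sum_i \frac{1}{n_{ai}} \Big) \cdots \Big( \sum_i \frac{1}{n_{bi}} \Big)}_{(p - 2)\text{-fold}} \Big\{ \sum_i \bar{y}_{ci} - \sum_i \bar{y}_{di} \Big\}^2$$

with $a \neq \dots \neq b \neq c \neq d$; $a < \dots < b$; $c < d$; $a, \dots, b, c, d = 1, \dots, p$;

$$E = \sum_{a < \dots < b < c} \underbrace{\Big( \sum_i \frac{1}{n_{ai}} \Big) \cdots \Big( \sum_i \frac{1}{n_{bi}} \Big) \Big( \sum_i \frac{1}{n_{ci}} \Big)}_{(p - 1)\text{-fold}}$$

with $a, \dots, b, c = 1, \dots, p$;
and there are $p$ letters: $a, \dots, b, c, d$, i.e., $c$, $d$, and $p - 2$ other
letters. The test of $H_{01}$ under the restrictions of “the unweighted
means” is:

$$F_{01''} = \frac{\text{d.f. of residual}}{p - 1} \cdot \frac{\chi_{01''}^2}{\text{sum of squares for residual}}, \tag{66}$$

with $n_1 = p - 1$ and $n_2$ = d.f. of residual.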
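The ratio $D/E$ is laborious to write but short to compute. The Python sketch below assumes the reading of $D$ and $E$ given above: for each pair of columns $(c, d)$ the squared difference of the sums of subclass means is multiplied by the product of $\sum_i 1/n_{ai}$ over the remaining $p - 2$ columns, and $E$ sums the $(p - 1)$-fold products. The function name and the data are hypothetical.

```python
from itertools import combinations
from math import prod


def unweighted_column_ss(n, ybar):
    """n[s][i] and ybar[s][i] are the subclass numbers and subclass means.
    Returns D / E for the column effect under the unweighted restrictions."""
    p = len(n)
    v = [sum(1 / c for c in row) for row in n]     # sum_i 1 / n_si
    m = [sum(row) for row in ybar]                 # sum_i ybar_si
    D = sum(prod(v[a] for a in range(p) if a not in (c, d))
            * (m[c] - m[d]) ** 2
            for c, d in combinations(range(p), 2))
    E = sum(prod(v[a] for a in range(p) if a != c) for c in range(p))
    return D / E


# Hypothetical 2 x 3 table (p = 2 columns, q = 3 rows).
n = [[2, 3, 1], [3, 1, 4]]
ybar = [[4.0, 6.0, 7.0], [4.0, 5.0, 9.0]]
ss_unweighted = unweighted_column_ss(n, ybar)
```

For $p = 2$ the empty $(p - 2)$-fold product is 1, so the expression reduces to the squared difference of the two sums of subclass means divided by $\sum_i n_{\cdot i} / (n_{1i} n_{2i})$.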
Next we wish to test the following hypothesis: 
$$H_{02}: E(B_i) = 0 \mid E(I_{si}) \neq 0, \tag{67}$$

i.e., the hypothesis that there is no row effect provided that the
interaction between column and row is existent. The condition involved
in this hypothesis is:

$$\xi_{\cdot m} - \xi_{\cdot q} = 0 \quad (m = 1, \dots, q - 1). \tag{68}$$


Once again, there are two definitions for this condition. For the re- 
strictions of “the weighted means,” (68) becomes: 





$$\frac{1}{n_{\cdot m}} \sum_s n_{sm} \xi_{sm} - \frac{1}{n_{\cdot q}} \sum_s n_{sq} \xi_{sq} = 0 \quad (m = 1, \dots, q - 1;\ s = 1, \dots, p). \tag{69}$$


Proceeding as before, we may write: 


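$$\phi = \sum_s \sum_i \sum_t (y_{sit} - \xi_{si})^2 + 2 \sum_m \rho_m \Big( \frac{1}{n_{\cdot m}} \sum_s n_{sm} \xi_{sm} - \frac{1}{n_{\cdot q}} \sum_s n_{sq} \xi_{sq} \Big), \tag{70}$$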


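where $(\rho_m)$’s are Lagrange’s multipliers. Using the maximum
likelihood method, we can obtain the relative minimum of $\phi$:

$$\chi_{r02'}^2 = \chi_a^2 + \sum_{\substack{u < v \\ u, v = 1, \dots, q}} \frac{n_{\cdot u} n_{\cdot v}}{n_{\cdot\cdot}} (\bar{y}_{\cdot u} - \bar{y}_{\cdot v})^2 = \chi_a^2 + \chi_{02'}^2, \tag{71}$$

where $\chi_{02'}^2$ is the estimate of the sum of squares due to row effects
under the present condition. Then the test of $H_{02}$ is given by:

$$F_{02'} = \frac{\text{d.f. of residual}}{q - 1} \cdot \frac{\chi_{02'}^2}{\text{sum of squares for residual}}, \tag{72}$$

with $n_1 = q - 1$ and $n_2$ = d.f. of residual.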
For the restrictions of “the unweighted means,” the condition 
(68) becomes: 


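$$\sum_s \xi_{sm} - \sum_s \xi_{sq} = 0 \quad (m = 1, \dots, q - 1), \tag{73}$$

and we may write:

$$\phi = \sum_s \sum_i \sum_t (y_{sit} - \xi_{si})^2 + 2 \sum_m \rho'_m \Big( \sum_s \xi_{sm} - \sum_s \xi_{sq} \Big), \tag{74}$$

where $(\rho'_m)$’s are Lagrange’s multipliers. Using the same method, we
obtain the relative minimum of $\phi$ in (74):

$$\chi_{r02''}^2 = \chi_a^2 + \frac{G}{H} = \chi_a^2 + \chi_{02''}^2, \tag{75}$$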


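where

$$G = \sum_{(a, \dots, b)(c, d)} \underbrace{\Big( \sum_s \frac{1}{n_{sa}} \Big) \cdots \Big( \sum_s \frac{1}{n_{sb}} \Big)}_{(q - 2)\text{-fold}} \Big\{ \sum_s \bar{y}_{sc} - \sum_s \bar{y}_{sd} \Big\}^2$$

with $a \neq \dots \neq b \neq c \neq d$; $a < \dots < b$; $c < d$; $a, \dots, b, c, d = 1, \dots, q$;

$$H = \sum_{a < \dots < b < c} \underbrace{\Big( \sum_s \frac{1}{n_{sa}} \Big) \cdots \Big( \sum_s \frac{1}{n_{sb}} \Big) \Big( \sum_s \frac{1}{n_{sc}} \Big)}_{(q - 1)\text{-fold}}$$

with $a, \dots, b, c = 1, \dots, q$;
and there are $q$ letters: $a, \dots, b, c, d$, i.e., $c$, $d$, and $q - 2$ other
letters. Then the test of $H_{02}$ under the restrictions of “the
unweighted means” is given by:

$$F_{02''} = \frac{\text{d.f. of residual}}{q - 1} \cdot \frac{\chi_{02''}^2}{\text{sum of squares for residual}}, \tag{76}$$

with $n_1 = q - 1$ and $n_2$ = d.f. of residual.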

If we find some interactions of second or higher order are sig- 
nificant, we can estimate the sum of squares due to each of the cor- 
responding main effects by applying similar procedures. 


2. Analysis of Covariance 

Here, to conserve space, the writer does not give the general
method of controlling independent variables.* The problem remaining
to be solved is how to estimate the sum of products for each
component in the case of unequal or disproportionate frequencies in the
subclasses. Without loss of generality, let us illustrate the
situation of one dependent variable with one independent variable to be
controlled. Let $y$ be a dependent variable and $y'$ an independent
variable. Assume both of them are capable of classifications indicated
previously. We are able to estimate the sums of squares for $y$ and
$y'$ of different components under different assumptions. For the
analysis of covariance of $y$ and $y'$, the valid estimates of the sums of
products for different components can also be obtained by using the
maximum likelihood method. Assume all the assumptions described
previously are true. In the problems of two classifications, say $p$
columns and $q$ rows, let $T'_{\cdot\cdot}$, $T'_{s\cdot}$, $T'_{\cdot i}$, $T'_{si}$, $\bar{y}'_{\cdot\cdot}$, $\bar{y}'_{s\cdot}$, $\bar{y}'_{\cdot i}$, and
$\bar{y}'_{si}$ be the values for $y'$ corresponding to $T_{\cdot\cdot}$, $T_{s\cdot}$, $T_{\cdot i}$, $T_{si}$,
$\bar{y}_{\cdot\cdot}$, $\bar{y}_{s\cdot}$, $\bar{y}_{\cdot i}$, and $\bar{y}_{si}$ for $y$, respectively. Let us define $\chi'^2$ as the valid
estimate of sum of products. By mathematical derivations, we can

* For this account, the reader may refer to 


P. O. Johnson and F. Tsao, Factorial design and covariance in the study of 
individual educational development, Part II, Psychometrika, 1945, 10, 159-162. 










obtain the sum of products for each component by simply changing 
the results of sum of squares in conformity with the following rules: 


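(1) $y_{sit}^2$ to $y_{sit}\, y'_{sit}$

(2) $\bar{y}^2$ to $\bar{y}\, \bar{y}'$

(3) $T^2$ to $T\, T'$

(4) $A_s \big( T_{s\cdot} - \sum_i n_{si} \bar{y}_{\cdot i} \big)$ to $A_s \big( T'_{s\cdot} - \sum_i n_{si} \bar{y}'_{\cdot i} \big)$

(5) $B_i \big( T_{\cdot i} - \sum_s n_{si} \bar{y}_{s\cdot} \big)$ to $B_i \big( T'_{\cdot i} - \sum_s n_{si} \bar{y}'_{s\cdot} \big)$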
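As an illustration of how these rules act, the sketch below applies rules (1) and (3) to the “within” sum of squares (20), giving the “within” sum of products, and checks it against the direct sum of cross-products of deviations from the subclass means. The paired data and the function names are hypothetical.

```python
# Hypothetical paired observations (y, y') in each subclass (s, i).
cells = {
    (0, 0): [(3, 1), (5, 2)],
    (0, 1): [(4, 2), (6, 2), (8, 5)],
    (1, 0): [(2, 0), (4, 3)],
    (1, 1): [(5, 4)],
}


def within_products(cells):
    """Eq. (20) with y^2 replaced by y y' and T^2 replaced by T T'."""
    total = 0.0
    for obs in cells.values():
        T = sum(y for y, _ in obs)
        T_prime = sum(yp for _, yp in obs)
        total += sum(y * yp for y, yp in obs) - T * T_prime / len(obs)
    return total


def within_products_direct(cells):
    """Sum of cross-products of deviations from the subclass means."""
    total = 0.0
    for obs in cells.values():
        my = sum(y for y, _ in obs) / len(obs)
        myp = sum(yp for _, yp in obs) / len(obs)
        total += sum((y - my) * (yp - myp) for y, yp in obs)
    return total


sp_within = within_products(cells)
```

Setting $y' = y$ in the same function returns the “within” sum of squares, which is a quick way to confirm that the substitution rules were applied consistently.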


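In doing this, the writer wishes to summarize all the estimates of the
sums of products for different components under different
assumptions:

1. Estimates of “within” and “interaction”

(a) Within:

$$\chi'^2_a = \sum_s \sum_i \sum_t y_{sit}\, y'_{sit} - \sum_s \sum_i \frac{T_{si} T'_{si}}{n_{si}}. \tag{77}$$

(b) Interaction: column × row:

$$\chi'^2_1 = \sum_s \sum_i \frac{T_{si} T'_{si}}{n_{si}} - \sum_s A_s \Big( T'_{s\cdot} - \sum_i n_{si} \bar{y}'_{\cdot i} \Big) - \sum_i \frac{T_{\cdot i} T'_{\cdot i}}{n_{\cdot i}}, \tag{78}$$

where $A_s$ is the same value as in (27).

2. Main effects:

(a) Assuming zero interaction:

(a1) column:

$$\chi'^2_{01} = \sum_i B_i \Big( T'_{\cdot i} - \sum_s n_{si} \bar{y}'_{s\cdot} \Big), \tag{79}$$

where $B_i$ is the same as in (49).

(a2) row:

$$\chi'^2_{02} = \sum_s A_s \Big( T'_{s\cdot} - \sum_i n_{si} \bar{y}'_{\cdot i} \Big), \tag{80}$$

where $A_s$ is the same as in (50).

(b) Assuming significant interaction:

(1) restrictions of “the weighted means”:

(b11) column:

$$\chi'^2_{01'} = \sum_{\substack{u < v \\ u, v = 1, \dots, p}} \frac{n_{u\cdot} n_{v\cdot}}{n_{\cdot\cdot}} (\bar{y}_{u\cdot} - \bar{y}_{v\cdot}) (\bar{y}'_{u\cdot} - \bar{y}'_{v\cdot}). \tag{81}$$

(b12) row:

$$\chi'^2_{02'} = \sum_{\substack{u < v \\ u, v = 1, \dots, q}} \frac{n_{\cdot u} n_{\cdot v}}{n_{\cdot\cdot}} (\bar{y}_{\cdot u} - \bar{y}_{\cdot v}) (\bar{y}'_{\cdot u} - \bar{y}'_{\cdot v}). \tag{82}$$

(2) restrictions of “the unweighted means”:

(b21) column:

$$\chi'^2_{01''} = \frac{D'}{E}, \tag{83}$$

where

$$D' = \sum_{(a, \dots, b)(c, d)} \underbrace{\Big( \sum_i \frac{1}{n_{ai}} \Big) \cdots \Big( \sum_i \frac{1}{n_{bi}} \Big)}_{(p - 2)\text{-fold}} \Big( \sum_i \bar{y}_{ci} - \sum_i \bar{y}_{di} \Big) \Big( \sum_i \bar{y}'_{ci} - \sum_i \bar{y}'_{di} \Big)$$

with $a \neq \dots \neq b \neq c \neq d$; $a < \dots < b$; $c < d$; $a, \dots, b, c, d = 1, \dots, p$;
and $E$ is defined as in (65). And there are $p$ letters: $a, \dots, b, c, d$,
i.e., $c$, $d$, and $p - 2$ other letters.

(b22) row:

$$\chi'^2_{02''} = \frac{G'}{H}, \tag{84}$$

where

$$G' = \sum_{(a, \dots, b)(c, d)} \underbrace{\Big( \sum_s \frac{1}{n_{sa}} \Big) \cdots \Big( \sum_s \frac{1}{n_{sb}} \Big)}_{(q - 2)\text{-fold}} \Big( \sum_s \bar{y}_{sc} - \sum_s \bar{y}_{sd} \Big) \Big( \sum_s \bar{y}'_{sc} - \sum_s \bar{y}'_{sd} \Big)$$

with $a \neq \dots \neq b \neq c \neq d$; $a < \dots < b$; $c < d$; $a, \dots, b, c, d = 1, \dots, q$;
and $H$ is defined as in (75). And there are $q$ letters: $a, \dots, b, c, d$,
i.e., $c$, $d$, and $q - 2$ other letters.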
For problems of k classifications, similar rules can be applied. 


3. Analysis of a 2 X q Table 

The writer has given all the possible solutions for the analysis 
of variance and covariance in the problems of unequal frequencies 
in the subclasses. Sometimes he has presented general methods rather 
than special results. Therefore, for the benefit of the non-mathemati- 
cal readers, the writer wishes to present a special case which fre- 
quently occurs in the applied fields, say, 2 columns X q rows. First 
let us define: 




























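$$P = \frac{\Big[ \sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}} (\bar{y}_{1i} - \bar{y}_{2i}) \Big]^2}{\sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}}, \tag{85}$$

$$P' = \frac{\Big[ \sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}} (\bar{y}_{1i} - \bar{y}_{2i}) \Big] \Big[ \sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}} (\bar{y}'_{1i} - \bar{y}'_{2i}) \Big]}{\sum_i \dfrac{n_{1i} n_{2i}}{n_{\cdot i}}}, \tag{86}$$

and $G$, $G'$, and $H$ are defined as in (75), (84), and (75), respectively.
Then the estimates of the sums of squares and products for different
components are as follows: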


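A. Estimates of the sums of squares:

1. Within and interaction:

(a) within:

$$\chi_a^2 = \sum_s \sum_i \sum_t y_{sit}^2 - \sum_s \sum_i \frac{T_{si}^2}{n_{si}}. \tag{87}$$

(b) interaction: column × row:

$$\chi_1^2 = \sum_s \sum_i \frac{T_{si}^2}{n_{si}} - \sum_i \frac{T_{\cdot i}^2}{n_{\cdot i}} - P. \tag{88}$$

2. Main effects:

(a) Assuming zero interaction:

(a1) column:

$$\chi_{01}^2 = P. \tag{89}$$

(a2) row:

$$\chi_{02}^2 = \sum_i \frac{T_{\cdot i}^2}{n_{\cdot i}} - \sum_s \frac{T_{s\cdot}^2}{n_{s\cdot}} + P. \tag{90}$$

(b) Assuming significant interaction:

(1) restrictions of “the weighted means”:

(b11) column:

$$\chi_{01'}^2 = \frac{n_{1\cdot} n_{2\cdot}}{n_{\cdot\cdot}} (\bar{y}_{1\cdot} - \bar{y}_{2\cdot})^2. \tag{91}$$

(b12) row:

$$\chi_{02'}^2 = \sum_{\substack{u < v \\ u, v = 1, \dots, q}} \frac{n_{\cdot u} n_{\cdot v}}{n_{\cdot\cdot}} (\bar{y}_{\cdot u} - \bar{y}_{\cdot v})^2. \tag{92}$$

(2) restrictions of “the unweighted means”:

(b21) column:

$$\chi_{01''}^2 = \frac{\Big[ \sum_i (\bar{y}_{1i} - \bar{y}_{2i}) \Big]^2}{\sum_i \dfrac{n_{\cdot i}}{n_{1i} n_{2i}}}. \tag{93}$$

(b22) row:

$$\chi_{02''}^2 = \frac{G}{H}. \tag{94}$$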


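B. Estimates of the sums of products:

1. Within and interaction:

(a) within:

$$\chi'^2_a = \sum_s \sum_i \sum_t y_{sit}\, y'_{sit} - \sum_s \sum_i \frac{T_{si} T'_{si}}{n_{si}}. \tag{95}$$

(b) interaction: column × row:

$$\chi'^2_1 = \sum_s \sum_i \frac{T_{si} T'_{si}}{n_{si}} - \sum_i \frac{T_{\cdot i} T'_{\cdot i}}{n_{\cdot i}} - P'. \tag{96}$$

2. Main effects:

(a) Assuming zero interaction:

(a1) column:

$$\chi'^2_{01} = P'. \tag{97}$$

(a2) row:

$$\chi'^2_{02} = \sum_i \frac{T_{\cdot i} T'_{\cdot i}}{n_{\cdot i}} - \sum_s \frac{T_{s\cdot} T'_{s\cdot}}{n_{s\cdot}} + P'. \tag{98}$$

(b) Assuming significant interaction:

(1) restrictions of “the weighted means”:

(b11) column:

$$\chi'^2_{01'} = \frac{n_{1\cdot} n_{2\cdot}}{n_{\cdot\cdot}} (\bar{y}_{1\cdot} - \bar{y}_{2\cdot}) (\bar{y}'_{1\cdot} - \bar{y}'_{2\cdot}). \tag{99}$$

(b12) row:

$$\chi'^2_{02'} = \sum_{\substack{u < v \\ u, v = 1, \dots, q}} \frac{n_{\cdot u} n_{\cdot v}}{n_{\cdot\cdot}} (\bar{y}_{\cdot u} - \bar{y}_{\cdot v}) (\bar{y}'_{\cdot u} - \bar{y}'_{\cdot v}). \tag{100}$$

(2) restrictions of “the unweighted means”:

(b21) column:

$$\chi'^2_{01''} = \frac{\Big[ \sum_i (\bar{y}_{1i} - \bar{y}_{2i}) \Big] \Big[ \sum_i (\bar{y}'_{1i} - \bar{y}'_{2i}) \Big]}{\sum_i \dfrac{n_{\cdot i}}{n_{1i} n_{2i}}}. \tag{101}$$

(b22) row:

$$\chi'^2_{02''} = \frac{G'}{H}. \tag{102}$$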

II. THE APPROXIMATE METHODS OF SOLVING
THE PROBLEM


1. The Method of Expected Equal Frequencies 
Without loss of generality, we present the case of two classifica- 
tions, say, p columns X q rows. The following steps are given: 
(1) Use the $\chi^2$ criterion to test the goodness of fit for the equal
frequency in each subclass. Define:

$$\bar{n} = \frac{n_{\cdot\cdot}}{pq}. \tag{103}$$

Then the test is given by:

$$\chi^2 = \sum_s \sum_i \frac{(n_{si} - \bar{n})^2}{\bar{n}} \tag{104}$$

with $pq - 1$ degrees of freedom.


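(2) Convert original measures to the adjusted measures. Define:

$$a_{si} = \frac{\bar{n}}{n_{si}} \sum_t (y_{sit} - \bar{y}_{si})^2, \tag{105}$$

which is the adjusted sum of squares for the subclass $(s, i)$. The
value of each $\bar{y}_{si}$ in the adjusted data, however, is the same as that
in the original data. The other $\bar{y}$’s are defined as:

$$\bar{y}_{s\cdot} = \frac{\sum_i \bar{y}_{si}}{q}, \tag{106}$$

$$\bar{y}_{\cdot i} = \frac{\sum_s \bar{y}_{si}}{p}, \tag{107}$$

$$\bar{y}_{\cdot\cdot} = \frac{\sum_i \bar{y}_{\cdot i}}{q} = \frac{\sum_s \bar{y}_{s\cdot}}{p} = \frac{\sum_s \sum_i \bar{y}_{si}}{pq}. \tag{108}$$

(3) Calculate the adjusted sum of squares for “within.” This
can be done by the following equation:

$$\text{Adjusted } \chi_a^2 = \sum_s \sum_i a_{si}. \tag{108}$$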


(4) Run the analysis of variance from the adjusted data by fol- 
lowing the usual procedures. 
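Steps (1) to (3) of the method of expected equal frequencies can be sketched as follows. The data are hypothetical and the function name is an assumption of this sketch; exact fractions are used so that the hand calculation can be followed digit for digit.

```python
from fractions import Fraction


def expected_equal_frequencies(cells, p, q):
    """Return (n_bar, chi2, degrees of freedom, adjusted within SS)."""
    n_bar = Fraction(sum(len(v) for v in cells.values()), p * q)   # eq. (103)
    chi2 = sum((len(v) - n_bar) ** 2 / n_bar
               for v in cells.values())                            # eq. (104)
    adjusted_within = Fraction(0)
    for obs in cells.values():
        mean = Fraction(sum(obs), len(obs))
        ss = sum((y - mean) ** 2 for y in obs)
        adjusted_within += n_bar / len(obs) * ss                   # eq. (105)
    return n_bar, chi2, p * q - 1, adjusted_within


# Hypothetical 2 x 3 table with unequal subclass numbers.
cells = {
    (0, 0): [3, 5],    (0, 1): [4, 6, 8], (0, 2): [7],
    (1, 0): [2, 4, 6], (1, 1): [5],       (1, 2): [6, 8, 10, 12],
}
n_bar, chi2, df, adj_within = expected_equal_frequencies(cells, 2, 3)
```

If `chi2` is not significant for `df` degrees of freedom, the analysis would proceed as in step (4), treating every subclass as if it held `n_bar` observations with the original subclass means.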


2. The Method of Expected Proportionate Frequencies


Without loss of generality, we present this method by using the
problem of 2 columns × $q$ rows. The following steps are involved:

(1) Use the $\chi^2$ criterion to test the goodness of fit for the
proportionate frequencies. Define:

$$\bar{n}_1 = \frac{n_{1\cdot}}{q}, \tag{109}$$

$$\bar{n}_2 = \frac{n_{2\cdot}}{q}. \tag{110}$$

Then the test reduces to

$$\chi^2 = \sum_i \frac{(n_{1i} - \bar{n}_1)^2}{\bar{n}_1} + \sum_i \frac{(n_{2i} - \bar{n}_2)^2}{\bar{n}_2} \tag{111}$$

with $2(q - 1)$ degrees of freedom.


(2) Convert original measures to adjusted measures. Define:

$$a_{1i} = \frac{\bar{n}_1}{n_{1i}} \sum_t (y_{1it} - \bar{y}_{1i})^2, \tag{112}$$

$$a_{2i} = \frac{\bar{n}_2}{n_{2i}} \sum_t (y_{2it} - \bar{y}_{2i})^2, \tag{113}$$

which are the adjusted sums of squares for the subclasses $(1, i)$ and
$(2, i)$, respectively. The value of each $\bar{y}_{si}$ in the adjusted data,
however, is the same as that in the original data. The other $\bar{y}$’s are
defined as:

$$\bar{y}_{1\cdot} = \frac{\sum_i \bar{y}_{1i}}{q}, \tag{114}$$

$$\bar{y}_{2\cdot} = \frac{\sum_i \bar{y}_{2i}}{q}, \tag{115}$$
Yui + Yoi 
~=————, 116 
. (116) 
zy: Yi. + He. = (ii fs Y2i) 
ee ei se , 117) 
y ; , 2 ( 


(3) Calculate the adjusted sum of squares for “within.” This
can be done by the following equation:

$$\text{Adjusted } \chi_a^2 = \sum_i a_{1i} + \sum_i a_{2i}. \tag{118}$$


(4) Run the analysis of variance from the adjusted data by 
following the usual procedures. 
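A corresponding sketch for the method of expected proportionate frequencies in a 2 × q table follows. It covers only the goodness-of-fit test (111) and the adjusted “within” sum of squares (112), (113), and (118); the function name and data are hypothetical.

```python
from fractions import Fraction


def expected_proportionate(col1, col2):
    """col1[i] and col2[i] list the observations in subclasses (1, i), (2, i).
    Returns (chi2, degrees of freedom, adjusted within SS)."""
    q = len(col1)
    chi2 = Fraction(0)
    adjusted_within = Fraction(0)
    for col in (col1, col2):
        n_bar = Fraction(sum(len(c) for c in col), q)     # eqs. (109), (110)
        for obs in col:
            chi2 += (len(obs) - n_bar) ** 2 / n_bar       # eq. (111)
            mean = Fraction(sum(obs), len(obs))
            ss = sum((y - mean) ** 2 for y in obs)
            adjusted_within += n_bar / len(obs) * ss      # eqs. (112), (113)
    return chi2, 2 * (q - 1), adjusted_within


# Hypothetical data: 2 columns, q = 3 rows.
col1 = [[3, 5], [4, 6, 8], [7]]
col2 = [[2, 4, 6], [5], [6, 8, 10, 12]]
chi2, df, adj_within = expected_proportionate(col1, col2)
```

Compared with the method of expected equal frequencies, only the expected numbers change: each column keeps its own total, divided equally among the rows.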

The approximate methods for the analysis of covariance are 
basically the same as those for the analysis of variance. They need 
not be presented here. 

In conclusion, the writer would like to extend his thanks and ap- 
preciation to Professors Palmer O. Johnson and Robert W. B. Jack- 
son for valuable aid in the conduct of this study.* 


* Applications have been made for each of the possible solutions reported here
to research problems in psychology, education, and biology. A paper reporting
these applications will be presented later.


REFERENCES

1. Brandt, A. E. The analysis of variance in 2 × s tables with
disproportionate frequencies. J. Amer. Stat. Assoc., 1933, 28, 164-173.

2. Johnson, P. O. and Neyman, J. Tests of certain linear hypotheses and their
application to some educational problems. Stat. Res. Memoirs, 1936, 1, 57-98.

3. Johnson, P. O. and Tsao, F. Factorial design and covariance in the study of
individual educational development. Psychometrika, 1945, 10, 133-162.

4. Johnson, P. O. and Tsao, F. Factorial design in the determination of
differential limen values. Psychometrika, 1944, 9, 107-144.

5. Nair, K. R. A note on the method of fitting constants for analysis of
non-orthogonal data arranged in a double classification. Sankhya, Indian J.
Stat., 1941, 5, 317-318.

6. Snedecor, G. W. The method of expected numbers for tables of multiple
classification with disproportionate subclass numbers. J. Amer. Stat. Assoc.,
1934, 29, 389-393.

7. Snedecor, G. W. and Cox, G. M. Disproportionate subclass numbers in tables
of multiple classification. Research Bulletin No. 180, Ames, Iowa, 1935.
Pp. 272.

8. Tsao, F. Tests of statistical hypotheses in the case of unequal or
disproportionate numbers of observations in the subclasses. Psychometrika,
1942, 7, 195-212.

9. Wald, A. On the analysis of variance in case of multiple classifications with
unequal class frequencies. Annals math. Stat., 1941, 12, 346-350.

10. Wilks, S. S. The analysis of variance and covariance in non-orthogonal data.
Metron, 1938, 8, 141-154.

11. Yates, F. The analysis of multiple classifications with unequal numbers in
the different classes. J. Amer. Stat. Assoc., 1934, 29, 51-66.

12. Yates, F. The principles of orthogonality and confounding in replicated
experiments. J. agric. Science, 1933, 23, 108-145.











PSYCHOMETRIKA—VOL. 11, NO. 2 
JUNE, 1946 


A NEW METHOD FOR ANALYZING AESTHETIC PREFER- 
ENCES: SOME THEORETICAL CONSIDERATIONS* 


E. A. PEEL 


LONDON UNIVERSITY INSTITUTE OF EDUCATION 


The aesthetic preferences of a group of persons are obtained
from their orders of sets of pictures and patterns according to
“liking.” The same pictures are ordered independently by a team of
experts, according to certain artistic criteria such as naturalism,
composition, color, rhythm, etc. The orders of preference and orders
according to the criteria are compared by correlation and matrices
of correlation formed from (1) correlations between the persons’
orders of preference; (2) correlations between the orders of
preference and orders according to artistic criteria; and (3) correlations
between the criterion orders. These matrices are symbolised by $R_1$,
$R_2$, and $R_3$, respectively, and combined to form a single matrix

$$\begin{bmatrix} R_1 & R_2 \\ R_2' & R_3 \end{bmatrix}.$$

Three interesting analyses of this matrix are suggested: analysis
of the whole matrix into its factors and rotation of the factors about
the criteria, regression estimates of individual preferences on the
artistic criteria, and regression estimates of the person preference
factors on the same criteria. Theoretical conditions and consequences
of these analyses are then discussed by the use of matrix notation.


I. Method 


The method consists of an assessment of aesthetic preferences 
in terms of the artistic qualities of pictures and designs as those qual- 
ities were estimated by competent judges. A detailed description of 
the experimental technique involved is given elsewhere (1, 2). Brief- 
ly, however, the assessment is made in the following manner. A test 
is composed of thirty-one items, pictures or patterns, which are 
classed by the persons tested in their order of preference or “liking.” 
The items are distributed in a frequency distribution which conforms 
approximately to that of the normal type, so that the Pearson prod- 
uct moment coefficient may be used when the orders of preference are 
correlated. 

The same items are also classed in the same distribution by a 
team of expert judges in accordance with certain artistic qualities 
such as “naturalism,” “composition,” “color,” “balance,” etc. The par- 
ticular qualities selected for each test depend, of course, upon the 
nature of the pictures or patterns making up the test. These artistic
qualities as they are estimated by the experts are named the criteria.
If, in their arrangement of the test items according to any particular
criterion, the judges show sufficient agreement, then their opinions
are averaged, and an order of the items of the test is obtained which
corresponds to the criterion in question.


* Part of a thesis approved by the University of London for the degree of
Ph.D. in Education in May, 1945.

Thus for any given test there are as many orders of preference 
as there are persons tested and as many orders according to criteria 
as there are artistic qualities selected for use with the test. By corre- 
lation of these orders we may obtain three different matrices of corre- 
lation coefficients: 

(1) Correlations between the persons’ orders of preference. 

(2) Correlations between orders of preference and the criteria. 

(3) Correlations between the criterion orders. 

Using matrix notation, these matrices may be combined to give
a single matrix $R$, where

$$R = \begin{bmatrix} R_p & R_{pc} \\ R'_{pc} & R_c \end{bmatrix},$$

and $R_p$ denotes the matrix of correlations between the persons’ orders
of preference, $R_{pc}$ is the matrix of correlations between the orders
of preference and the criteria, and $R_c$ is the matrix of correlations
between the criteria themselves. For any particular test the three
submatrices of correlations, $R_p$, $R_{pc}$, and $R_c$, are obtained from
rankings of the same items, and statistical analyses which are valid for
one type are therefore valid for the others or for combinations of all
three: such a combination is found in $R$.
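
The assembly of the combined matrix can be sketched numerically. The
example below is only an illustration under stated assumptions: the item
scores are random stand-ins for real orders of preference, and the names
`person_scores`, `criterion_scores`, `R_p`, `R_pc` and `R_c` are chosen
here and do not come from the original experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_persons, n_criteria = 31, 5, 3

# Hypothetical data: each column is one ordering of the 31 items,
# expressed as a score per item (random here, purely for illustration).
person_scores = rng.normal(size=(n_items, n_persons))
criterion_scores = rng.normal(size=(n_items, n_criteria))

# Correlate all orders at once; rows are items, columns are variables.
scores = np.hstack([person_scores, criterion_scores])
R = np.corrcoef(scores, rowvar=False)

# Partition R into the three submatrices described in the text.
p = n_persons
R_p = R[:p, :p]      # persons with persons
R_pc = R[:p, p:]     # persons with criteria
R_c = R[p:, p:]      # criteria with criteria

# Reassembling the blocks recovers R, with the transpose of R_pc
# in the lower-left corner.
R_rebuilt = np.block([[R_p, R_pc], [R_pc.T, R_c]])
```

Because all three blocks come from one product-moment computation on the
same items, the combined matrix is symmetric with a unit diagonal.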

Three interesting statistical possibilities are suggested by the 
matrix of correlations R . 

(1) In the first place it would be possible to analyze the entire 
matrix R into its factors and so obtain general and bipolar factors 
entering into the correlations between both persons and criteria com- 
bined. The axes of reference could then be rotated about the vectors 
corresponding to the criteria, and from inspection of the rotated load- 
ings, the influence of the criteria in the persons’ aesthetic choice could 
be assessed. 

(2) Secondly, the regression estimate of each person’s preference
on the artistic criteria could be obtained, and in this way the
influence of each criterion on each person could be estimated and the 
extent to which the criteria account for each person’s preference could 
be assessed. 

(3) Lastly, by first analyzing the submatrix $R_p$ of preference
intercorrelations into its factors, and then obtaining the estimate of 
these factors on the criteria it would be possible to identify the fac- 
tors which characterize the group of persons as a whole. 










II. Factorial Analysis of the Matrix R


The centroid method of factor analysis is used, and as these fac- 
tors are not necessarily the best factors to consider for purposes of 
interpretation, rotation of the factors about the artistic criteria is 
utilized as far as possible for interpretation of the results. This raises 
the question of what properties the criteria should possess in order 
to fulfil the function assigned to them. 

One such ideal condition should be that rotations of the centroid 
factors are possible such that there should be one and only one sig- 
nificant criterion loading for any one factor. This condition is a 
special case of Thurstone’s “simple structure” (3, 156). He seeks “a 
structure in which each trait vector is contained in one or more of 
the r-orthogonal co-ordinate hyperplanes” (3, 151). In this research 
we seek a structure such that each criterion vector lies on an orthog- 
onal axis. 

A second condition is that the criteria should possess communalities
at least as high as those of the persons, for the greater the variance
due to the criteria which is revealed by the factors extracted,
the better is their selection justified as criteria of the persons’
preferences.

Some consequences can be deduced from these properties of ideal
criteria:


(1) The criteria are uncorrelated.

If there are $p$ persons and $c$ criteria, then the matrix of general
factor loadings is given by

$$M = \begin{bmatrix} M_p \\ M_c \end{bmatrix}.$$

By virtue of the first property, the submatrix $M_c$ is square and is
formed by permuting the columns of a diagonal matrix. The reduced
correlation matrix is given by

$$R = \begin{bmatrix} M_p \\ M_c \end{bmatrix} \begin{bmatrix} M'_p & M'_c \end{bmatrix} = \begin{bmatrix} M_p M'_p & M_p M'_c \\ M_c M'_p & M_c M'_c \end{bmatrix},$$

and the non-diagonal cells of $M_c M'_c$, the matrix of correlation
coefficients between the criteria, are zero, for the product of a diagonal
matrix, which has been permuted by columns, and its transpose, is a
diagonal matrix. Hence the criteria are uncorrelated.
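
A small numerical check of this argument may help. The loadings and the
permutation below are invented for illustration; only the structure (a
diagonal matrix with its columns permuted) is taken from the text.

```python
import numpy as np

# Hypothetical criterion loadings: three criteria, three factors, each
# criterion loaded on exactly one factor (values are made up).
D = np.diag([0.9, 0.8, 0.7])
perm = [2, 0, 1]                 # an arbitrary column permutation
M_c = D[:, perm]

# Product of the permuted diagonal matrix with its transpose.
R_c = M_c @ M_c.T

# Everything off the diagonal should vanish.
off_diagonal = R_c - np.diag(np.diag(R_c))
```

The diagonal of `R_c` holds the squared loadings, that is, the criterion
communalities, and every off-diagonal cell is zero.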


(2) There are as many uncorrelated criteria as factors extracted. 










This is immediately implied in the properties of the matrix $M_c$
stated above.


(3) The correlations of the persons with each criterion are propor- 
tional to their loadings in that factor in which the criterion is 
non-zero. 


This follows from an examination of the cells of the matrix
$M_p M'_c$. Each correlation coefficient between a person and a criterion
is given by an equation $r = mc$, where $c$ is the square root of the
communality of the criterion and $m$ is the loading of the person in
that factor.


(4) If, moreover, the criterion communalities approach unity,
that is to say, $M_c$ approaches $I$ permuted by columns, then the
persons’ factor loadings approximate to the value of the correlation
coefficients between persons and criteria, that is to say, $M_p$
approximates to $R_{pc}$.
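
Consequences (3) and (4) can be illustrated with made-up loadings. In this
sketch `factor_of` (which factor carries each criterion) and `roots` (the
square roots of the criterion communalities) are hypothetical names and
values introduced for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical person loadings on three factors (four persons).
M_p = rng.uniform(-0.5, 0.5, size=(4, 3))

# Ideal criteria: one non-zero loading each, on factors 2, 0, 1 in turn.
roots = np.array([0.9, 0.8, 0.7])      # square roots of communalities
factor_of = np.array([2, 0, 1])        # factor carrying each criterion
M_c = np.zeros((3, 3))
M_c[np.arange(3), factor_of] = roots

# Consequence (3): each person-criterion correlation is m times c.
R_pc = M_p @ M_c.T
expected = M_p[:, factor_of] * roots   # loading in that factor times c

# Consequence (4): with unit communalities R_pc is M_p, columns permuted.
M_c_unit = np.zeros((3, 3))
M_c_unit[np.arange(3), factor_of] = 1.0
R_pc_unit = M_p @ M_c_unit.T
```

In the unit-communality case the agreement between $M_p$ and the
person-criterion correlations is exact once the columns are put in
matching order.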


III. Regression Estimate of the Persons’ Individual Preferences 


Referring to the matrix of correlation coefficients

$$R = \begin{bmatrix} R_p & R_{pc} \\ R'_{pc} & R_c \end{bmatrix},$$

the preference of a single person, symbolized by $z_i$, can be estimated
on the criteria used in conjunction with the test by the regression
equation

$$\hat{z}_i = r'_i R_c^{-1} c, \tag{1}$$

where $r'_i$ is the row vector of the correlation coefficients between the
person’s preference and the criteria, which are designated by the column
vector $c$.

For the column vector $z$ of all the persons tested, the estimate
becomes

$$\hat{z} = R_{pc} R_c^{-1} c. \tag{2}$$


The regression estimates given in Equation (2) may be readily
obtained, for the matrix coefficient $R_{pc} R_c^{-1}$ is easily computed
in one stage by the method due to Aitken (see Thomson, 4, 307). In the
case above this consists in the pivotal condensation of the matrix

$$\left[\begin{array}{c|c} R_c & I \\ -R_{pc} & 0 \end{array}\right]$$

until all entries to the left of the vertical partitioning line have been
cleared.
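
The following is a minimal sketch of condensation of this kind, not a
reproduction of Aitken's worksheet. It assumes a bordered arrangement with
the criterion correlations and an identity matrix in the upper rows and
the negated person-criterion correlations below; the function name
`condense` and the numerical values are hypothetical.

```python
import numpy as np

def condense(A, B, C):
    """Return C A^{-1} B by clearing the columns of A below each pivot."""
    n = A.shape[0]
    T = np.block([[A, B], [-C, np.zeros((C.shape[0], B.shape[1]))]])
    T = T.astype(float)
    for k in range(n):
        # Clear column k in every row below the pivot row.
        factors = T[k + 1:, k] / T[k, k]
        T[k + 1:, :] -= np.outer(factors, T[k, :])
    # The lower rows now hold zeros on the left and C A^{-1} B on the right.
    return T[n:, n:]

# Hypothetical correlations: three criteria, two persons.
R_c = np.array([[1.0, 0.3, 0.2],
                [0.3, 1.0, 0.1],
                [0.2, 0.1, 1.0]])
R_pc = np.array([[0.5, 0.4, 0.1],
                 [0.2, 0.6, 0.3]])

weights = condense(R_c, np.eye(3), R_pc)       # R_pc R_c^{-1}
direct = np.linalg.solve(R_c, R_pc.T).T        # same product, library route
```

No row interchanges are made, which is adequate here because a positive
definite correlation matrix gives non-zero pivots; a general-purpose
routine would need pivoting.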


IV. Regression Estimate of the Aesthetic Factors Which Characterize 
the Preference of a Group of Persons 


In order to estimate the factors which characterize the prefer- 
ence of a group of persons, we apply the argument used by Thomson 
in estimating test and ability factors (5, 41) to correlations between 
persons. 


We first analyze the submatrix $R_p$ of

$$\begin{bmatrix} R_p & R_{pc} \\ R'_{pc} & R_c \end{bmatrix}$$

into its factors and obtain the person specification equations

$$z = M_p f, \tag{3}$$

where $z$ is normalized and symbolizes the preferences of the persons
characterized as a group by the factors $f$. $M_p$ includes common and
specific factors and has more columns than rows. Hence the product
$M_p M'_p$ is nonsingular,

$$(M_p M'_p)(M_p M'_p)^{-1} = I,$$

from which it follows that

$$(M_p M'_p)(M_p M'_p)^{-1} z = I M_p f = M_p f.$$

By dropping $M_p$ from both sides of this equation, we obtain the
estimate of the factors on the person preferences

$$\hat{f} = M'_p (M_p M'_p)^{-1} z = M'_p R_p^{-1} z, \tag{4}$$

since, apart from errors, $M_p M'_p = R_p$.
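
A short numerical sketch of this estimate follows, with invented loadings
(four persons, two common and four specific factors). The names `common`,
`specific` and `f_hat` are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical loadings: 4 persons, 2 common + 4 specific factors,
# so M_p has more columns (6) than rows (4).
common = rng.uniform(0.3, 0.6, size=(4, 2))
specific = np.diag(np.sqrt(1.0 - (common ** 2).sum(axis=1)))
M_p = np.hstack([common, specific])

R_p = M_p @ M_p.T                      # unit diagonal by construction
f = rng.normal(size=6)                 # one hypothetical set of factor scores
z = M_p @ f                            # specification equation (3)

# Estimate of the factors from the preferences, as in equation (4).
f_hat = M_p.T @ np.linalg.solve(R_p, z)
```

In this sketch `f_hat` reproduces `z` exactly when put back through the
specification equation, yet it does not equal the `f` that generated `z`:
with more factors than persons the factors can only be estimated, which is
why the text speaks of an estimate.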
If we require only the estimates of the common factors $f_0$, then
Equation (4) becomes

$$\hat{f}_0 = M'_{p0} R_p^{-1} z, \tag{5}$$

where $M_p$ is partitioned by the common and specific factor loadings
and is written

$$M_p = [\, M_{p0} \mid M_{p1} \,].$$

Thus with respect to the matrix of correlation data

$$\begin{bmatrix} R_p & R_{pc} \\ R'_{pc} & R_c \end{bmatrix}$$

we have the two statements given by equations (2) and (4). The
first gives an estimate of the persons’ individual preferences on the
criteria, and the second an estimate on the individual preferences of
the factors which characterize the group as a whole.

We now make an important step, by which the estimates given
by equations (2) and (4) may be combined to give an estimate of the
preference factors on the artistic criteria. Substituting the estimate
$\hat{z}$ obtained by equation (2) in equation (4), we obtain the equation

$$\hat{\hat{f}} = M'_p R_p^{-1} R_{pc} R_c^{-1} c, \tag{6}$$

which gives the required estimate of the factor $f$ on the artistic
criteria $c$. We are thus enabled to identify the factors which
characterize the aesthetic preferences of the group of persons in terms
of the artistic criteria used in the tests.
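
The substitution can be checked numerically. All matrices below are
hypothetical, and `B` is simply a name used here for the combined
multiplier acting on the criteria.

```python
import numpy as np

# Hypothetical correlation blocks: 3 persons, 2 criteria.
R_p = np.array([[1.0, 0.5, 0.4],
                [0.5, 1.0, 0.3],
                [0.4, 0.3, 1.0]])
R_pc = np.array([[0.6, 0.2],
                 [0.5, 0.1],
                 [0.3, 0.4]])
R_c = np.array([[1.0, 0.2],
                [0.2, 1.0]])
# Hypothetical person loadings on 2 factors (any M_p would serve here).
M_p = np.array([[0.7, 0.2],
                [0.6, 0.1],
                [0.5, 0.4]])

c = np.array([1.2, -0.5])                         # one item's criterion scores

z_hat = R_pc @ np.linalg.solve(R_c, c)            # person estimate from criteria
f_two_step = M_p.T @ np.linalg.solve(R_p, z_hat)  # factor estimate from z_hat

# The same result as a single matrix multiplier applied to c.
B = M_p.T @ np.linalg.solve(R_p, R_pc) @ np.linalg.inv(R_c)
f_direct = B @ c
```

The two routes agree, so the double estimate is a single linear function
of the criteria.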

There are certain implications in Equation (6) which should be
clearly stated at this stage. The estimate of $f$ in this equation has a
double character, for it is an estimation derived by substituting an
estimate of $z$, given by Equation (2), for $z$ in the equation of
estimation (4). For this reason the double estimate of $f$ is symbolized in
Equation (6) by $\hat{\hat{f}}$. The possible loss in accuracy which is
introduced by the double process of estimation is considered to be offset,
however, by the objective nature of the artistic criteria and the exact
mathematical knowledge of the degree of correspondence between estimated
factors and the criteria given by the equation of estimation.

Equation (6) gives the estimates of all the factors, common and
specific, but if we are interested only in the common factors $f_0$ then
the estimate becomes

$$\hat{\hat{f}}_0 = M'_{p0} R_p^{-1} R_{pc} R_c^{-1} c. \tag{7}$$


The task of computing the matrix coefficient
$M'_{p0} R_p^{-1} R_{pc} R_c^{-1}$ is not so formidable as first appears,
for we may calculate the factor $M'_{p0} R_p^{-1}$ by the shortened method
due to Ledermann (6).
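
The text does not spell out the shortened method. The sketch below assumes
it is the identity usually associated with Ledermann's note, in which only
a matrix of order equal to the number of common factors is inverted; this
identification, the loadings, and the names `u2`, `W` and `J` are
assumptions of the example.

```python
import numpy as np

# Hypothetical common-factor loadings: 5 persons, 2 common factors.
M_0 = np.array([[0.7, 0.1],
                [0.6, 0.3],
                [0.5, -0.4],
                [0.4, 0.5],
                [0.3, -0.2]])
u2 = 1.0 - (M_0 ** 2).sum(axis=1)        # uniquenesses (specific variance)
R_p = M_0 @ M_0.T + np.diag(u2)          # model-implied correlations

# Direct route: invert the full persons-by-persons matrix.
direct = M_0.T @ np.linalg.inv(R_p)

# Shortened route: only a factors-by-factors matrix is inverted.
W = M_0.T / u2                            # loadings divided by uniquenesses
J = W @ M_0
short = np.linalg.solve(np.eye(2) + J, W)
```

With many persons and few common factors the second route replaces one
large inversion by a small one, which is the saving the text alludes to.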

Of the four terms in the matrix multiplier
$M'_{p0} R_p^{-1} R_{pc} R_c^{-1}$, only the first changes under orthogonal
rotation of the factors $f_0$. The term $R_p^{-1} R_{pc} R_c^{-1}$ is
invariant under such a rotation. We may, therefore, rotate $f_0$ about any
significant criterion in the column vector of criteria $c$. By such a
rotation the estimate $\hat{\hat{f}}_0 = Bc$ becomes transformed to a new
estimate

$$\hat{\hat{g}}_0 = HBc, \tag{8}$$

where $H$ is orthogonal.
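
A brief check of the rotation property, using hypothetical matrices and an
arbitrary plane rotation `H`; the variable `invariant` stands for the three
terms that do not involve the loadings.

```python
import numpy as np

# Hypothetical quantities: 3 persons, 2 common factors, 2 criteria.
M_p0 = np.array([[0.7, 0.2],
                 [0.6, -0.3],
                 [0.5, 0.4]])
R_p = np.array([[1.0, 0.5, 0.4],
                [0.5, 1.0, 0.3],
                [0.4, 0.3, 1.0]])
R_pc = np.array([[0.6, 0.2],
                 [0.5, 0.1],
                 [0.3, 0.4]])
R_c = np.array([[1.0, 0.2],
                [0.2, 1.0]])

# The part of the multiplier that does not involve the loadings.
invariant = np.linalg.solve(R_p, R_pc) @ np.linalg.inv(R_c)
B = M_p0.T @ invariant

theta = np.deg2rad(30.0)                       # arbitrary rotation angle
H = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# Rotating the factors by H turns the loadings M_p0 into M_p0 H'.
M_rot = M_p0 @ H.T
B_rot = M_rot.T @ invariant                    # multiplier after rotation
```

Only the loadings are touched by the rotation, so the new multiplier is
`H @ B`, and the correlations reproduced by the loadings are unchanged.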










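V. Variances and Covariances of the Person and Factor Estimates


For the estimates of the person preferences we have

$$\hat{z} = R_{pc} R_c^{-1} c,$$

and the variances and covariances of these estimates are given by

$$\begin{aligned} \hat{z}\hat{z}' &= R_{pc} R_c^{-1} c c' R_c^{-1} R'_{pc} \\ &= R_{pc} R_c^{-1} R'_{pc}, \end{aligned} \tag{9}$$

since $cc' = R_c$.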
The variances and covariances of the factors as estimated by
the regression equations

$$\hat{f}_0 = M'_{p0} R_p^{-1} z$$

are given by

$$\hat{f}_0 \hat{f}'_0 = M'_{p0} R_p^{-1} z z' R_p^{-1} M_{p0} = M'_{p0} R_p^{-1} M_{p0}, \tag{10}$$

since $zz' = R_p$.
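
These covariance formulas can be evaluated on a small hypothetical model.
The sketch also computes the covariance obtained when the person
preferences are replaced by their estimates from the criteria, which is
the comparison the text turns to next. The loadings are invented and
`loss` is a name introduced here.

```python
import numpy as np

# Hypothetical model: 4 persons, 2 common factors, 2 criteria.
M_p0 = np.array([[0.7, 0.2],
                 [0.6, -0.3],
                 [0.5, 0.4],
                 [0.4, 0.5]])
M_c = np.array([[0.8, 0.1],
                [0.2, 0.7]])
R_p = M_p0 @ M_p0.T
np.fill_diagonal(R_p, 1.0)               # unit self-correlations
R_c = M_c @ M_c.T
np.fill_diagonal(R_c, 1.0)
R_pc = M_p0 @ M_c.T

Rp_inv, Rc_inv = np.linalg.inv(R_p), np.linalg.inv(R_c)

cov_z_hat = R_pc @ Rc_inv @ R_pc.T                        # person estimates
cov_single = M_p0.T @ Rp_inv @ M_p0                       # factors from z
cov_double = M_p0.T @ Rp_inv @ cov_z_hat @ Rp_inv @ M_p0  # factors from c

# Difference between the single and the double estimate covariances.
loss = cov_single - cov_double
```

In this example the variances of the doubly estimated factors do not
exceed those of the singly estimated ones, which is the loss in accuracy
that the comparison is meant to reveal.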
If the regression equation

$$\hat{\hat{f}}_0 = M'_{p0} R_p^{-1} R_{pc} R_c^{-1} c$$

is used as the basis of estimation of $f_0$, the covariances are given by

$$\begin{aligned} \hat{\hat{f}}_0 \hat{\hat{f}}'_0 &= M'_{p0} R_p^{-1} R_{pc} R_c^{-1} c c' R_c^{-1} R'_{pc} R_p^{-1} M_{p0} \\ &= M'_{p0} R_p^{-1} R_{pc} R_c^{-1} R'_{pc} R_p^{-1} M_{p0}. \end{aligned} \tag{11}$$

A comparison of the covariances obtained by using Equations
(10) and (11) is interesting, for it reveals the loss in accuracy
introduced by using the double estimates $\hat{\hat{f}}_0$ instead of the
single estimates $\hat{f}_0$. In a particular test used, a set of
landscapes, it was found that two common factors could be extracted, and
the variances and covariances of the estimate of these factors, $f_1$ and
$f_2$, as given by Equations (10) and (11) were, respectively, as follows:

$$\hat{f}_0 \hat{f}'_0 = \begin{bmatrix} .675 & .000 \\ .000 & .676 \end{bmatrix}, \qquad \hat{\hat{f}}_0 \hat{\hat{f}}'_0 = \begin{bmatrix} .322 & -.001 \\ -.001 & .338 \end{bmatrix}.$$


VI. Reciprocity of the Estimates of Factors and Criteria


By Equation (2) the persons’ individual preferences were estimated
on the artistic criteria. Referring to the matrix

$$\begin{bmatrix} R_p & R_{pc} \\ R'_{pc} & R_c \end{bmatrix},$$

we may also obtain the analogous converse relation giving the estimate
of $c$, the artistic criteria, on the person preferences $z$.
For a single criterion $c_j$ the regression equation is

$$\hat{c}_j = r'_j R_p^{-1} z, \tag{12}$$

where $r'_j$ is the row vector of person correlations with the criterion
$c_j$. For the column vector of criteria $c$ the estimate becomes

$$\hat{c} = R'_{pc} R_p^{-1} z. \tag{13}$$

By resolving the submatrix $R_p$ into its factors, we obtain the
specification equations for person preference

$$z = M_p f.$$

Substituting this value of $z$ in Equation (13), we obtain the relation

$$\hat{c} = R'_{pc} R_p^{-1} M_p f, \tag{14}$$

which gives the estimation of the criteria $c$ on the factors
characterizing the group of persons tested. This equation is the converse
relation to Equation (6), by which

$$\hat{\hat{f}} = M'_p R_p^{-1} R_{pc} R_c^{-1} c.$$


It is of interest to compare these estimates of $f$ and $c$ by first
rewriting Equation (14) in the form

$$\hat{c} = (M'_p R_p^{-1} R_{pc})' f. \tag{15}$$

If then we write $A = M'_p R_p^{-1} R_{pc}$, the estimates become

$$\hat{\hat{f}} = A R_c^{-1} c, \tag{16}$$

and

$$\hat{c} = A' f. \tag{17}$$

Furthermore, if the criteria are orthogonal $R_c = I$, for orthogonal
criteria are uncorrelated and in the regression estimates the diagonal
cells of $R_c$ are filled with unit entries. Hence $R_c^{-1} = I$ and
equation (16) becomes

$$\hat{\hat{f}} = Ac, \tag{18}$$

and there is revealed a reciprocity between the estimates of $f$ and $c$,
as given by Equations (17) and (18).
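
The reciprocity can be verified numerically. The sketch below uses
hypothetical matrices and assumes uncorrelated criteria in its last step;
the descriptive variable names are introduced for the example.

```python
import numpy as np

# Hypothetical quantities: 3 persons, 3 factors in M_p, 2 criteria.
M_p = np.array([[0.7, 0.2, 0.1],
                [0.6, -0.3, 0.2],
                [0.5, 0.4, -0.3]])
R_p = np.array([[1.0, 0.5, 0.4],
                [0.5, 1.0, 0.3],
                [0.4, 0.3, 1.0]])
R_pc = np.array([[0.6, 0.2],
                 [0.5, 0.1],
                 [0.3, 0.4]])

# The matrix A defined in the text from loadings and correlations.
A = M_p.T @ np.linalg.solve(R_p, R_pc)

# Multiplier of f in equation (14), compared with the transpose of A.
criteria_on_factors = R_pc.T @ np.linalg.inv(R_p) @ M_p
matches = np.allclose(criteria_on_factors, A.T)

# With uncorrelated criteria R_c = I, so (16) reduces to (18).
R_c = np.eye(2)
factors_on_criteria = A @ np.linalg.inv(R_c)
```

Under that assumption the factors are estimated from the criteria by `A`
and the criteria from the factors by its transpose.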

We may therefore state that if the criteria have zero correlation
with each other and the matrix of persons’ inter-correlations is
analyzed into its factors, there exists a reciprocity between the estimates
of the factors on the criteria and the criteria on the factors, given by
relations of the form $\hat{\hat{f}} = Ac$ and $\hat{c} = A'f$.


REFERENCES 


1. Peel, E. A. On identifying aesthetic types. Brit. J. Psychol., 1945, 35, 61-69.

2. Peel, E. A. Unpublished thesis for Ph.D. in Education, London University.

3. Thurstone, L. L. The vectors of mind. Chicago: Univ. Chicago Press, 1935.

4. Thomson, G. H. The factorial analysis of human ability. London and Boston,
1939.

5. Thomson, G. H. Some points of mathematical technique in the factorial
analysis of human ability. J. exp. Psychol., 1936, 27, 36-54.

6. Ledermann, W. A shortened method of estimation of mental factors by
regression. Nature, 1938, 141, 650.













